-- HarryRenshall - 16 Jan 2008

Week of 080121

Open Actions from last week: Announce the supported middleware versions for the Feb run at the Monday weekly operations meeting.

Monday:

See the weekly phone conference in Indico.

Tuesday:

Experiment report(s):

ALICE: (PM) One SL4 VO-box is already in production at CERN, and sites at RAL, in Mexico and in Russia are currently migrating. All ALICE Tier1 sites should have SL4 VO-boxes by the February startup.

LHCb: (HR) There is a meeting happening now discussing configuring their Castor Tier1 sites for CCRC'08.

CMS: (DB) [Apologies: limited attendance at these meetings this whole week due to participation in the CMS DM/WM workshop in Lyon.] --- Discussion on the megatable figures from which to extract the CCRC transfer goals is almost finalized. --- Polling of T2s (supporting the CMS VO) on SRMv2 deployment plans is in progress. We are counting which T2s will have SRMv2, and will plan transfer tests accordingly.

Core services (CERN) report: A fix to the current gfal problem is being targeted for next Monday.

DB services (CERN) report:

  • The patch for the streams bug which has affected the ATLAS streams set-up since last Tuesday has been delivered by Oracle this morning and has been applied to the source database of the ATLAS online RAC (ATONR). It is a non-rolling patch and therefore a service interruption of one hour was necessary for ATONR. The streams capture process has been restarted. (The bug was observed after a procedure to compact and compress the PVSS archived data, developed in collaboration with ATLAS and IT-CO).

Monitoring / dashboard report:

Release update:

Questions from sites/experiments:

Question by CMS: should we expect a regular review by WLCG of SRMv2 deployment status at Tier-2s in the coming days, as the deadline approaches?

BNL: (ME) A concern is access to ATLAS conditions data by Tier1 reprocessing jobs. Various placement models are being looked at. JDS asked if their space requirements are clear, to which the reply was yes, but the configuration is not complete.

AOB: (JDS) We are going to document known issues and any workarounds. We will also be discussing high-level monitoring of CCRC'08. I am thinking of a weekly report based on the 3 types of metrics: experiment scaling factors, experiment-defined critical services and MoU targets.

Wednesday:

Experiment report(s):

ALICE: (PM) Pilot sites are continuing migration to the SL4 VO-box.

LHCb: (RS) A new bug has been found in the lcg-cp CLI where it no longer accepts a user-defined number of streams. A fix has already been committed to CVS. This matters more for LHCb production than for CCRC'08. When asked, RS thought they could use the fixed version from the applications area for now, then look again next Monday.

Core services (CERN) report: (MCdS) Today's CASTOR central services upgrades (Oracle security patch and network reconfiguration) are finished.

DB services (CERN) report: (MG) The Oracle patch for the ATONR streams replication bug has been deployed. Tomorrow CNAF will migrate the LHCb conditions database to new hardware. There will be a long downtime at NIKHEF for a machine room move.

Monitoring / dashboard report:

Release update: (ST) FTS is going to the PPS now and LFC on 32-bit SLC4 will go tomorrow. gfal will be released tomorrow. dcache.org has announced two patches on the 1.8.0-12 baseline release but they are not visible yet.

Questions from sites/experiments: JS asked if sites had started installing the baseline software versions released on Monday. RAL has started testing, GRIF is already there and PIC is aware of the list.

AOB: JS reported two new pages on the CCRC'08 Twiki. One is for site concerns and the other is for known software 'features' that are unlikely to be fixed for the February run. The intention is to document any workarounds. These items will be discussed at next Monday's meeting. He also said the MB wants to see metrics in all areas to be used in the experiment functional tests and that WLCG will define such metrics if an experiment does not. He will try to produce a list of areas where there is currently no known metric.

Thursday:

Experiment report(s):

ALICE: (PM) PM will be testing srm-alice.cern.ch with ALICE use cases.

CMS: (DB) We are continuing to work out the requirement numbers for CCRC'08 phase 1. We are also trying to understand which Tier2s can be involved and what to do for those that cannot deploy SRM 2.2 in time. We have appointed a European and an OSG Tier2 coordinator. MJ reported that the GRIF Tier2 will be ready for CMS.

ATLAS: (SC) NL-T1 will be ready with their full configuration (they are currently moving machine rooms). ATLAS will be running a large Monte Carlo production (to produce 3 times last year's data in 2 months) concurrently with CCRC'08 phase 1. The disk space for this was already included in the ATLAS resource planning.

LHCb: (RS) LHCb have now published their space requirements (see links on https://twiki.cern.ch/twiki/bin/view/LHCb/CCRC08) and requested Tier1 sites to report their status. There was a discussion on how best to document this and the feeling was that sites should do it. The LHCb CERN SRM server became overloaded by several hundred file merging jobs.

Core services (CERN) report: (MC-S) Castor CMS was upgraded to 2.1.6-7 today. We are looking at upgrading the other experiments next week.

DB services (CERN) report: (MG) Oracle security patches have been applied on the CMS integration RAC and others will be done tomorrow.

Monitoring / dashboard report:

Release update: (NT) A patch for the bug in lcg-cp to a classic SE is on the way.

Questions from sites/experiments:

AOB: JS announced that WLCG wishes to identify official Tier2 regional coordinators and that they would also be publishing a calendar of events (e.g. for future releases and CCRC'08 events). SC reported that ATLAS had held a shift workshop and were thinking of the best methods for publishing daily reports. JS added that WLCG should refresh its list of Tier1 hot-line contacts.

Friday:

Experiment report(s):

ATLAS: (SC) SC asked Castor operations to check whether write access to the CERN CAF disk pool is now restricted.

Core services (CERN) report: Castor operations is monitoring the CMS instance running the new 2.1.6-7 software with a view to scheduling upgrades to the other experiments next week.

DB services (CERN) report: The CMS Oracle upgrade has gone well. NL-T1 is now down until next Tuesday, and on Monday CNAF will move the LHCb LFC to a new machine. The availability of streams replication has now been added to the CERN SLS status display.

Monitoring / dashboard report:

Release update:

Questions from sites/experiments:

AOB: FD reported that the issue of FTS file copies between an SRMv1 site and an SRMv2 dCache site not seeing the PERMANENT flag is fixed by the last two patch releases (1 and 2) to dCache 1.8.0-12. She knew some sites had configured their FTS channels to use url_copy without any loss of performance. AS announced that CNAF have developed an improved WMS monitoring that they are now packaging for other sites. HR announced the availability of the calendar of events (at https://twiki.cern.ch/twiki/bin/view/LCG/CCRC08Calendar) and JS requested experiments to add any additional planned activities of theirs.

Topic revision: r8 - 2008-01-25 - HarryRenshall
 