Week of 150713
WLCG Operations Call details
- At CERN the meeting room is 513 R-068.
- For remote participation we use the Vidyo system. Instructions can be found here.
General Information
- The purpose of the meeting is:
- to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
- to announce or schedule interventions at Tier-1 sites;
- to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
- to provide important news about the middleware;
- to communicate any other information considered interesting for WLCG operations.
- The meeting should run from 15:00 until 15:20, exceptionally to 15:30.
- The SCOD rota for the next few weeks is at ScodRota
- General information about the WLCG Service can be accessed from the Operations Web
- Whenever a particular topic that requires information from sites or experiments needs to be discussed at the daily meeting, it is highly recommended to announce it by email to wlcg-operations@cern.ch, so that the relevant parties have time to collect the required information or invite the right people to the meeting.
Monday
Attendance:
- local: Andrea Sciabà, Christoph Wissing (CMS), Luca Canali (IT-DB), Maarten Litmaath (ALICE)
- remote: Michael Ernst (BNL), Pavel Weber (KIT), Sang Un Ahn (KISTI), David Cameron (ATLAS), Christian (NDGF), Di Qing (TRIUMF), Lisa Giacchetti (FNAL), Tiju Idiculla (RAL)
Experiments round table:
- ATLAS reports -
- LSF problem affecting T0 reported last week has gone
- Still problems reading data from NIKHEF: GGUS:114431. Would be nice if a DPM/FTS expert could take a look.
- CMS reports -
- Nothing really new since last meeting on Thursday
- ALICE -
Sites / Services round table:
- ASGC:
- BNL: ntr
- CNAF:
- FNAL: ntr
- GridPP:
- IN2P3:
- JINR:
- KISTI: ntr
- KIT:
- NDGF: a full day of downtime this Thursday for an electrical power intervention at the Copenhagen site; all ALICE and ATLAS data stored there will be unavailable
- NL-T1:
- NRC-KI:
- OSG:
- PIC:
- RAL: ntr
- TRIUMF: ntr
- CERN batch and grid services:
- CERN storage services:
- Databases: ntr
- GGUS:
- Grid Monitoring:
- MW Officer:
AOB:
Thursday
Attendance:
- local: A. Sciabà, D. Cameron (ATLAS), Nacho (Grid and batch services)
- remote: Jeff Templon (NL-T1), Lisa Giacchetti (FNAL), Thomas Hartmann (KIT), Gareth Smith (RAL), Rob Quick (OSG), Christian (NDGF)
Experiments round table:
- ALICE -
- high activity
- CERN: LSF cap removed on Tue July 14, thanks!
- the cap used to be 15k and was very often reached during the past several weeks
- the number of concurrent jobs is now often 20k+
- a peak of 27k+ was sustained for many hours!
- CNAF: tape SE has been upgraded to Xrootd 4.1.3, thanks!
- testing in progress
- disk SE will follow
- LHCb reports -
- Half of the jobs are having problems contacting a VOBOX at CERN. Under investigation.
Sites / Services round table:
- ASGC:
- BNL:
- CNAF:
- FNAL: ntr
- GridPP:
- IN2P3: ntr
- JINR:
- KISTI:
- KIT: ntr
- NDGF: downtime tomorrow for a security update of dCache. It will be a rolling update, and ATLAS and ALICE data will be temporarily unavailable.
- NL-T1: ntr
- NRC-KI:
- OSG: ntr
- PIC: ntr
- RAL: ntr
- TRIUMF: ntr
- CERN batch and grid services: ntr
- CERN storage services:
- Databases:
- GGUS:
- Grid Monitoring:
- MW Officer:
AOB:
Christian mentioned that NDGF is getting frequent tickets from the ATLAS shifters about missing files. The files are missing because of the Rucio+FTS problem reported in the previous weeks, which is now solved. NDGF cannot do anything about them, as this is not a site problem. David said that ATLAS is aware of this and will tell shifters not to open tickets to sites for files from a list of known missing files.