Week of 211011
WLCG Operations Call details
- For remote participation we use Zoom: https://cern.zoom.us/j/99591482961
- The pass code is provided on the wlcg-operations list.
- You can contact the wlcg-ops-coord-chairpeople list (at cern.ch) if you do not manage to subscribe.
General Information
- The purpose of the meeting is:
- to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
- to announce or schedule interventions at Tier-1 sites;
- to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
- to provide important news about the middleware;
- to communicate any other information considered interesting for WLCG operations.
- The meeting should run from 15:00 Geneva time until 15:20, exceptionally to 15:30.
- The SCOD rota for the next few weeks is at ScodRota
- Whenever a particular topic that requires information from sites or experiments needs to be discussed at the operations meeting, it is highly recommended to announce it by email to wlcg-scod@cern.ch, so that the SCOD can make sure the relevant parties have time to collect the required information, or can invite the right people to the meeting.
Best practices for scheduled downtimes
Monday
Attendance:
- local:
- remote: Miro (Chair, DB), Jiri (ATLAS), Andrew (NL-T1), Gavin (Compute), Xavier (KIT), Christoph (CMS), Pinja (Security), Chien-De (ASGC), Henryk (LHCb), Andrew (TRIUMF), Alberto (Monitoring), Doug (BNL), Darren (NDGF), Maarten (ALICE), Josep (PIC), DaveM (FNAL), Francesco (CNAF), Borja (Monitoring)
Experiments round table:
- ATLAS reports
- Global data challenge last week
- Run 2 reprocessing continues
- Tape data challenge this week
- CMS reports
- Network throughput challenge
- A few unforeseen limitations (likely mostly due to usage of *_Test-RSEs)
- More complete evaluation still to come
- Tape challenge
- User job submission tool CRAB being switched to use WebDAV, moving away from SRM/GridFTP
- ALICE
- Mostly business as usual.
- The tape challenge was started ~09:00.
- ALICE rates can be seen here.
- LHCb reports
- Activity:
- Tape data challenge started today
- Issues:
Sites / Services round table:
- ASGC: Upgraded to ARC6
- BNL: Report to be added
- CNAF: NTR
- EGI: NC
- FNAL: NTR
- IN2P3: NTR
- JINR: NTR
- KISTI: NC
- KIT: NTR
- NDGF: NTR
- NL-T1:
- Nikhef: We have been having problems with one of our core network routers over the last week. The router failed on 6th October, causing a half-day outage. On 8th October the routing engine had to be switched over to the backup routing engine, which caused an hour's outage. At the moment the router is stable and investigations are ongoing. There will most likely be a downtime for a firmware upgrade on the router in the future.
- NRC-KI: NC
- OSG: NC
- PIC: NTR
- RAL: NTR
- TRIUMF: NTR
- CERN computing services: NTR
- CERN storage services:
- (non-WLCG service): CERN AFS will be inaccessible from non-CERN networks from Oct 25th to Nov 1st, see OTG:0066507. Please contact the CERN AFS team (ticket) if this affects your site or experiment.
- CERN databases: NTR
- GGUS: NTR
- Monitoring: NTR
- Middleware: NTR
- Networks: NTR
- Security: NTR
AOB: