remote: Alberto (monitoring), Andrew (TRIUMF), Borja (monitoring), Concezio (LHCb), Dave (FNAL), David (Technion), Felix (ASGC), Gavin (T0), Giuseppe (CMS), Horst (Oklahoma), Johannes (ATLAS), Maarten (ALICE + WLCG), Matt (Lancaster), Nikolay (monitoring), Pedro (monitoring), Stephan (CMS), Vincent (security)
Stable Grid production with up to ~380k concurrently running grid job slots with the usual mix of MC generation, simulation, reconstruction, derivation production and user analysis, including ~45k slots from the HLT/Sim@CERN-P1 farm and ~15k slots from Boinc. Occasional additional peaks of 200k job slots from HPCs.
Continuing with about 60k job slots used for Folding@Home jobs since 4 April: 50% from ~55 different grid sites via opt-in and 50% at CERN-P1.
No other major issues apart from the usual storage- or transfer-related problems at sites
Finishing the grand unification of production+analysis queues in PanDA in the coming days.
All systems recovered quickly from the Oracle/DBonDemand downtime last Saturday - we would appreciate it if such downtimes could be avoided over weekends in the future
CTA in production for ATLAS since Monday - some issues in Rucio/middleware still being fixed
CMS
Covid-19 compute contributions being returned to experiment use
main processing activities:
Run 2 ultra-legacy Monte Carlo
Run 2 pre-UL Monte Carlo
migration to Rucio ongoing
production of nanoAOD samples configured for PhEDEx being bumped up to complete more quickly
LHCb
still running F@H on part of HLT farm
large MC requests are coming up, so we are going to reduce this Covid-19-related activity
processing (small) samples of lead-lead collisions and lead-neon fixed target collisions
grid drained in preparation for the CERN Oracle/DBOD outage of last Saturday; DIRAC services and agents were switched off, then back on after the outage; everything went extremely smoothly
Discussion on F@H reductions
Maarten: it is perfectly defensible to ramp down resources for F@H, as we have already done a lot and we cannot neglect our own duties