WLCGOpsMinutes220303 (2022-03-08, MaartenLitmaath)
<!-- -- > <font size="6"> %RED% *DRAFT* %BLACK% </font> <br /><br /> <!-- -->

---+!! WLCG Operations Coordination Minutes, March 3, 2022

%TOC{depth="4"}%

---++ Main points
   * [[LCG/WLCGOpsMinutes220303#Impact_of_the_war_in_Ukraine_on][Impact of the war in Ukraine on WLCG]]
   * [[LCG/WLCGOpsMinutes220303#Tokens_Globus_update][Tokens & Globus update]]
   * [[https://indico.cern.ch/event/1096039/][Pre-GDB on operations effort]] took place on the 24th of February. <br/> Well attended. Summary will be presented in the [[https://indico.cern.ch/event/1096028/][GDB]] on March 9.

---++ Agenda

https://indico.cern.ch/event/1133785/

---++ Attendance
   * local:
   * remote: Alessandra D (Napoli), Andrea (WLCG), Andrew (TRIUMF), Borja (monitoring), Christoph (CMS), Concezio (LHCb), David Cameron (ATLAS + ARC), David Cohen (Technion), Eric (!IN2P3), Giuseppe (CMS), Julia (WLCG), Maarten (ALICE + WLCG), Matt (Lancaster), Max (KIT), Miltiadis (WLCG), Panos (WLCG), Pepe (PIC), Romain (WLCG), Shawn (AGLT2 + networks), Simone (WLCG), Stephan (CMS), Thomas (DESY)
   * apologies:

---++ Operations News
   * the next meeting is planned for April 7

---++ Special topics

---+++ Impact of the war in Ukraine on WLCG

see the [[https://indico.cern.ch/event/1133785/#9-impact-of-the-war-in-ukraine][presentation]]

---++++ Discussion
   * Simone:
      * the CERN Council will meet on March 8, more news after that
   * Thomas:
      * what will the mailing list be used for?
   * Simone:
      * the list is there for sites to inform us about <br/> policies by which they have to abide and <br/> for advice on how to implement them
      * the list is not for sites to get news; <br/> other channels will be used for that
   * Romain:
      * matters can also be escalated via security contacts

---+++ Tokens & Globus update

see the [[https://indico.cern.ch/event/1133785/#7-tokens-globus-update][presentation]]

---++++ Discussion
   * Thomas:
      * what about other VOs in EGI?
      * might we need to set up ARC CEs to keep supporting X509 for them?
   * Maarten:
      * EGI have launched a survey for other VOs to consider these matters
         * some may imitate WLCG and move to IAM
         * others may switch to EGI Check-in
      * grid services will need to support multiple token providers
         * if those providers share a common basis, that should not be difficult
      * _after the meeting:_
         * the !AuthZ WG has EGI Check-in reps
         * the common basis is the =AARC Blueprint Architecture=
   * Thomas:
      * VOs like ILC and Belle-II use DIRAC and hence should be fine?
   * Maarten:
      * indeed, as DIRAC will be made to work for LHCb anyway
      * DIRAC also is the default framework for small VOs in EGI
   * Concezio:
      * a DIRAC release supporting tokens is in the making
   * David Cameron:
      * HTCondor-G can still use X509 with the ARC CE REST interface
         * _after the meeting: the slides have been corrected_
      * will the development for the longer token lifetimes be ready on time?
   * Maarten:
      * to be followed up in the !AuthZ WG

---++ Middleware News
   * Useful Links
      * WLCGBaselineTable
      * Baselines/News

---++ Tier 0 News

---++ Tier 1 Feedback

---++ Tier 2 Feedback

---++ Experiments Reports

---+++ ALICE
   * Mostly business as usual, no major incidents
   * High analysis activity in preparation for [[https://indico.cern.ch/event/895086/][Quark Matter 2022]], April 4-10
   * Run-3 preparations continuing
      * ~90% of the VOboxes switched from legacy =AliEn= to new =JAliEn= services
      * Fraction of 8-core jobs to be ramped up for Run-3 workflows
         * Most sites should only receive 8-core jobs during Run 3

---+++ ATLAS
   * Mostly smooth running with an average of 700k cores
      * This may go down soon with fewer opportunistic resources available (EuroHPCs and HLT farm)
   * Run 2 data and MC reprocessing campaigns effectively done, just following up a few remaining problematic tasks
   * Tape challenge starting on 14 March for two weeks
   * Still issues with out-of-date storage reporting at dCache sites

---++++ Discussion
   * Julia:
      * the dCache SRR instabilities are being looked into

---+++ CMS
   * running smoothly with 320-400k cores
      * usual production/analysis split of 3:1
      * up to 95k cores of non-pledged contribution (40k on average)
      * utilization of US HPC allocations on track/ahead of schedule
   * production activity mainly Run 2 ultra-legacy Monte Carlo
      * new large pile-up library being made; I/O limits of site storage reached, resulting in CPU inefficiencies
   * Tier-0 activities
      * successful large scale test (P5-->Meyrin, processing, writing to tape)
      * 9.1 GB/s processing reached (48 hour average), enough even for HeavyIon
      * cosmic ray data taking for Run 3 commissioning started
   * upgrade of HammerCloud test jobs for Run 3 software/input datasets on hold
      * need python3 version/port of HC, developer estimate: several weeks
   * WebDAV commissioning ongoing
      * SRM+WebDAV ready at all but one Tier-1 site
      * endpoint check/commissioning at Tier-3 sites in progress
   * CMSWeb service upgraded to accept tokens
   * Token commissioning for HTCondor CEs in progress
      * waiting for HTCondor interface for ARC CEs
   * preparing for the tape challenge later in March
      * successful transfer tests to PIC and FNAL
   * Thanks to all sites who made their 2022 pledge already available!

---+++ LHCb
   * Running at 150-170k cores, no major issues
      * lots of WebDAV transfer failures involving GridKa GGUS:156238
         * side effect due to CMS "putting storage systems to their limits"
   * Transfers from P8 to CERN Tier0 performed this week
      * more than 2 PB transferred over two days
      * 1.6x nominal throughput sustained
      * some further optimisations possible from LHCb online
      * a couple of issues with CTA (imbalance between the nodes, and wrong archival reports) being followed up
      * plan is to keep data on EOS and use them as input for the next data challenge (3rd and 4th week of March)

---+++ Discussion
   * Stephan:
      * regarding the fallout from CMS activities:
         * the production team were too optimistic about how many <br/> of those "heavy" jobs could be run concurrently, sorry!
         * the last ones should finish by the weekend
         * such productions will be controlled better from now on

---++ Task Forces and Working Groups

---+++ GDPR and WLCG services
   * Good progress in publishing CERN RoPOs: CTA, EOS, CVMFS, FTS and central monitoring services
   * [[GDPRandWLCG][Updated list of services]]

---+++ Accounting TF
   * Meeting held with the experts to discuss the status of preparation for the integration of the new benchmark in the accounting workflow. Will be presented at the GDB next week

---+++ Information System Evolution TF
   * Validation of the network information in CRIC is progressing well, still ongoing.

---+++ IPv6 Validation and Deployment TF

Detailed status [[WlcgIpv6#IPv6Depl][here]].

---+++ Monitoring
   * New !XRootD Monitoring components
      * !XRootD Shoveler is ready to be used to send data to CERN AMQ from non-OSG sites
      * !XRootD Collector patches are being developed and tested
      * Both components need to be ready before first test flows can be established at non-OSG sites
   * Agreed on the minimum required schema for transfers to be meaningful
      * Meeting with dCache developers held to discuss viability of required fields
      * Agreement for some of these fields (activity and vo) to be discussed on a higher level as "scitags" since they are needed for other purposes as well
      * Follow-up discussions with other developers (!XRootD, !MonALISA, ...) to be planned/held
   * Defined first "Network Monitoring" template draft, to be filled in by T1s in the near future
      * First iteration will be done with AGLT2 to check how complete it is and what improvements it needs

---+++ Network Throughput WG

%INCLUDE{ "NetworkTransferMetrics" section="03032022" }%

---+++ WG for Transition to Tokens and Globus Retirement

see the [[LCG/WLCGOpsMinutes220303#Tokens_Globus_update][special topic]]

---++ Action list

%INCLUDE{ "WLCGOpsCoordActionList" }%

---++ AOB
Topic revision: r15 - 2022-03-08 - MaartenLitmaath