DRAFT

WLCG Operations Coordination Minutes, March 3, 2022

Highlights

Agenda

https://indico.cern.ch/event/1133785/

Attendance

  • local:
  • remote:
  • apologies:

Operations News

Special topics

Impact of the war in Ukraine on WLCG

Tokens & Globus update

Middleware News

Tier 0 News

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE

  • Mostly business as usual, no major incidents
  • Run-3 preparations continuing
    • ~90% of the VOboxes switched from legacy AliEn to new JAliEn services
    • Fraction of 8-core jobs to be ramped up for Run-3 workflows
      • Most sites should only receive 8-core jobs during Run 3

ATLAS

  • Mostly smooth running with average 700k cores
    • This may go down soon with fewer opportunistic resources available (EuroHPCs and HLT farm)
  • Run 2 data and MC reprocessing campaigns effectively done, just following up on a few remaining problematic tasks
  • Tape challenge starting on 14 March for two weeks
  • Still issues with out-of-date storage reporting at dCache sites

CMS

  • running smoothly with 320-400k cores
    • usual production/analysis split of 3:1
    • up to 95k cores of non-pledged contribution (40k on average)
    • utilization of US HPC allocations on track/ahead of schedule
    • production activity mainly Run 2 ultra-legacy Monte Carlo
    • new large pile-up library being made; I/O limits of site storage reached, resulting in CPU inefficiencies
  • Tier-0 activities
    • successful large scale test (P5-->Meyrin, processing, writing to tape)
      • 9.1 GB/s processing reached (48-hour average), enough even for heavy-ion running
    • cosmic ray data taking for Run 3 commissioning started
  • upgrade of HammerCloud test jobs for Run 3 software/input datasets on hold
    • need Python 3 version/port of HC, developer estimate: several weeks
  • WebDAV commissioning ongoing
    • SRM+WebDAV ready at all but one Tier-1 site
    • endpoint check/commissioning at Tier-3 sites in progress
  • CMSWeb service upgraded to accept tokens
  • Token commissioning for HTCondor CEs in progress
    • waiting for HTCondor interface for ARC CEs
  • preparing for tape challenge later in March
    • successful transfer tests to PIC and FNAL
  • Thanks to all sites who made their 2022 pledge already available!

LHCb

  • Running at 150-170k cores, no major issues
    • lots of WebDAV transfer failures involving GridKa (GGUS:156238)
      • side effect due to CMS "putting storage systems to their limits"
  • Transfers from P8 to CERN Tier0 performed this week
    • more than 2PB transferred over two days
    • 1.6x nominal throughput sustained
    • some further optimisations possible from LHCb online
    • a couple of issues with CTA (imbalance between the nodes, and wrong archival reports) being followed up
    • plan is to keep data on EOS and use them as input for the next data challenge (3rd and 4th week of March)
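
For orientation, the transfer volume quoted above can be turned into an average rate. The following is a minimal sketch, assuming decimal units (1 PB = 10^6 GB) and that "two days" means a full 48 hours; the function name is hypothetical and only for illustration. Since more than 2 PB was transferred, the result is a lower bound.

```python
def average_rate_gb_per_s(volume_pb: float, duration_days: float) -> float:
    """Average transfer rate in GB/s, assuming decimal units (1 PB = 1e6 GB)."""
    seconds = duration_days * 24 * 3600
    return volume_pb * 1e6 / seconds


# 2 PB over two days (assumed to be a full 48 hours); a lower bound on the average rate
rate = average_rate_gb_per_s(2.0, 2.0)
print(f"at least {rate:.1f} GB/s on average")
```

The minutes do not state the nominal rate itself, so it is not derived here.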

Task Forces and Working Groups

GDPR and WLCG services

Accounting TF

  • Meeting held with the experts to discuss the status of preparations for integrating the new benchmark into the accounting workflow. To be presented at the GDB next week.

Information System Evolution TF

  • Validation of the network information in CRIC is progressing well, still ongoing.

IPv6 Validation and Deployment TF

Detailed status here.

Monitoring

  • New XRootD Monitoring components
    • XRootD Shoveler is ready to be used to send data to CERN AMQ from WLCG (non-OSG) sites
    • XRootD Collector patches are being developed and tested
    • Both components need to be ready before first test flows can be established at (non-OSG) sites
  • Agreed on the minimum required schema for transfers to be meaningful
    • Meeting with dCache developers held to discuss viability of required fields
      • Agreement for some of these fields (activity and vo) to be discussed at a higher level as "scitags", since they are needed for other purposes as well
    • Follow-up discussions with other developers (XRootD, MonALISA...) to be planned/held
  • Defined first "Network Monitoring" template draft, to be filled in by T1s in the near future
    • First iteration will be done with AGLT to check how complete it is and what improvements are needed

Network Throughput WG


WG for Transition to Tokens and Globus Retirement

Action list

Creation date Description Responsible Status Comments

Specific actions for experiments

Creation date Description Affected VO Affected TF/WG Deadline Completion Comments

Specific actions for sites

Creation date Description Affected VO Affected TF/WG Deadline Completion Comments

AOB

Topic revision: r13 - 2022-03-03 - BorjaGarridoBear
 