WLCG Operations Coordination Minutes, Feb 2, 2023

Highlights

Agenda

https://indico.cern.ch/event/1248250/

Attendance

  • local:
  • remote: David B (IN2P3-CC), David Cameron (ATLAS), Stephan (CMS), Borja (WLCG Monitoring), David Cohen (IL T2 sites), David Schindrer, Andrea (TRIUMF), Wenjing (AGLT2), David (FNAL), Eric (IN2P3), Panos (WLCG), Petr (ATLAS), Shawn (WLCG Monitoring, UMICH), Henryk (LHCb + NCBJ), Horst (OU ATLAS T2), Cedric (CERN), Chris (T2_TW_NCHC), Sebastien (KIT), Phillippe Laurence, Julia (WLCG)
  • apologies: Maarten (WLCG & ALICE), Concezio (LHCb)

Operations News

  • We are looking for volunteer sites running XrootD storage to test new components for the XrootD monitoring infrastructure.
  • We plan to decommission the CERN BDII by the beginning of March.

Special topics

XrootD monitoring status

David Cameron: How is the VO identified in the monitoring message?

Borja: Normally the monitoring stream contains the name of the VO; however, this depends on the site configuration. In some cases the VO name is missing.

What about authentication? Clients are recommended to authenticate with a robot certificate. Since not all sites have robot certificates, host certificates should also be fine.

What about token authentication?

Julia: We are discussing token authentication as a requirement for the ActiveMQ messaging service at CERN.

Borja: For testing purposes, basic password authentication can be used for the time being.

David C: What about dCache with xRootD?

Julia: We have agreed on a common set of attributes to be reported by all primary monitoring data producers. This includes dCache. We are now discussing the implementation with the dCache developers. Since FNAL hosts the pileup, which is the main source for remote reading by CMS jobs via the xrootd protocol, it was discussed at the workshop in Lancaster that we need to enable xrootd monitoring at FNAL with the highest priority. Whom should we ping at FNAL to work together on testing the new flow for the dCache+xrootd use case?

Stephan L: It should be the dCache support team at FNAL. After the meeting Stephan sent Julia the name of the contact person.

Site survey about Linux OS experience and plans

What was the purpose of the survey?

Julia: To collect input from the sites regarding their current experience, concerns and requests, and to understand whether the proposal of FNAL and CERN is in line with current site experience.

Middleware News

Tier 0 News

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE

  • Lowish to normal activity on average in the last weeks
  • More sites have been switched from single- to 8-core pilots
    • Other sites are planned to follow in the coming weeks

ATLAS

  • Mostly smooth running with 500-700k slots on average. Briefly went over 1 million total running cores last week with combined contribution from a few HPCs.
  • No noticeable effects from any energy issues over Xmas break
  • Job monitoring in MONIT was broken from 30 Dec to 5 Jan, thanks to the MONIT team for repairing the missing data
  • Experimenting with non-SRM access to tape: direct HTTPS writes at FZK and REST API at BNL

CMS

  • smooth running, especially over the holidays
    • so well that we had completed the submitted jobs by the end of the holidays
    • good core usage between 280k and 395k cores
    • usual production/analysis split of between 3:1 and 8:1
    • significant contribution from HPCs peaking at over 55k
    • main production activity Run 2 ultra-legacy Monte Carlo and Run 3
  • Many thanks to the OSG team for updating 3.5 before the holidays and fixing a hidden CS8/SRM issue for us
    • a remaining issue discovered and reported to gfal2 team
  • transfers to tape at JINR are backing up: suboptimal tape family setup, working with the site to suggest a better config
  • waiting on python3 version/port of HammerCloud
  • working with our DPM sites to migrate to other storage technology
    • limited by operations manpower/expertise
  • token migration progressing steadily
    • setting IAM permissions/groups ongoing
    • native xrootd config ready, working on dCache config, starting on EOS
    • uncovered an issue in latest xrootd client, waiting on a fix
    • new SAM xrootd probe tests that metadata access is restricted to CMS members; made tickets for about ten sites where only data access is restricted
  • looking forward to 24x7 production IAM support by CERN

LHCb

  • smooth running
    • somewhat lower activity due to temporary reduction of simulation production requests
  • data transfer tests with CNAF ongoing
  • defining milestones and data challenges for new proto-Tier1 sites at NCBJ Warsaw and IHEP-Beijing
  • new major DIRAC release deployed
    • no CE token support yet

Task Forces and Working Groups

GDPR and WLCG services

Accounting TF

  • Main focus is to ensure readiness for switching to HEPScore benchmark in the accounting workflow by the 1st of April. Regular meetings with APEL developers to follow up on the progress as well as dedicated meetings with involved parties (CERN experts for T0 accounting, OSG accounting system developers, EGI operations)

Information System Evolution TF

  • All CRIC APIs which contain private information have been put behind authentication

IPv6 Validation and Deployment TF

Detailed status here.

Monitoring

  • XRootD
    • Status update delivered for XRootD improvements
      • Requested sites to deploy new shoveler to start testing more flows
  • Network Monitoring
    • BNL preparing the information, should be available by end of the month
    • Will need to follow up with other sites (campaign?), probably starting with T1s

Network Throughput WG


WG for Transition to Tokens and Globus Retirement

Action list

Creation date | Description | Responsible | Status | Comments

Specific actions for experiments

Creation date | Description | Affected VO | Affected TF/WG | Deadline | Completion | Comments

Specific actions for sites

Creation date | Description | Affected VO | Affected TF/WG | Deadline | Completion | Comments

AOB

  • Next meeting is scheduled for the 2nd of March
Topic revision: r11 - 2023-02-07 - MaartenLitmaath
 