Week of 220502

WLCG Operations Call details

  • For remote participation we use Zoom: https://cern.zoom.us/j/99591482961
    • The pass code is provided on the wlcg-operations list.
    • You can contact the wlcg-ops-coord-chairpeople list (at cern.ch) if you do not manage to subscribe.

General Information

  • The purpose of the meeting is:
    • to report significant operational issues (i.e. issues which can or did degrade experiment or site operations) which are ongoing or were resolved after the previous meeting;
    • to announce or schedule interventions at Tier-1 sites;
    • to inform about recent or upcoming changes in the experiment activities or systems having a visible impact on sites;
    • to provide important news about the middleware;
    • to communicate any other information considered interesting for WLCG operations.
  • The meeting should run from 15:00 Geneva time until 15:20, exceptionally to 15:30.
  • The SCOD rota for the next few weeks is at ScodRota
  • Whenever a particular topic needs to be discussed at the operations meeting requiring information from sites or experiments, it is highly recommended to announce it by email to the wlcg-scod list (at cern.ch) to allow the SCOD to make sure that the relevant parties have the time to collect the required information, or invite the right people at the meeting.

Best practices for scheduled downtimes

Monday

Attendance:

  • remote: Kate (DB, chair), Julia (WLCG), Maarten (ALICE, WLCG), Onno (NL-T1), Alberto (monit), Pinja (sec), Jiri (ATLAS), Christoph (CMS), Borja (monit), Andrew (TRIUMF), Dave (FNAL), Daniele (CNAF), Federico (LHCb), Douglas (BNL), Ville (NDGF)

Experiments round table:

  • CMS
    • IGTF update 1.116 introduced a change for the CERN CA, causing trouble for various Java-based services (due to a buggy upstream library)
      • SAM tests started to fail on the CMS dCache instance at KIT after the update was installed. The failures went away after a restart. (No GGUS ticket, solved on the spot via MM)
      • Similar issue in Bari (GGUS:156985); it is not entirely clear whether it is really related to the IGTF update
      • VOMS servers at CERN had trouble with CERN certificates: RQF:2026342

  • ALICE
    • added after the meeting: myproxy.cern.ch is broken (GGUS:157130)

  • LHCb
    • NTR

Sites / Services round table:

  • CERN computing services:
    • Renewal of the CERN Grid Certification Authority certificate: OTG:0070386
  • CERN storage services:
  • CERN databases: NTR
  • GGUS: NTR
  • Monitoring:
    • Draft SiteMon availability/reliability reports for April 2022 have been distributed
  • Middleware:
    • CERN Grid CA renewal saga
      • IGTF version 1.116 was released Monday last week to
        make a new CERN Grid CA certificate available with
        its lifetime extended by another 10 years
      • The new certificate was put into production this morning
      • Despite careful testing of the new certificate beforehand,
        several incidents occurred with instances of Java services
        that make use of the canl-java security library:
        • dCache
        • StoRM
        • VOMS-Admin
        • Argus
      • Except when such a service uses the latest canl-java
        (e.g. the most recent dCache versions), such services
        all need to be restarted to get into a state that
        is consistent with clients using the new IGTF release
      • VOMS-Admin at CERN was restarted on Thursday evening last week,
        earlier than foreseen, in response to tickets from ATLAS and CMS
      • Argus services at CERN were restarted this morning
Douglas asked which dCache versions do not require the upgrade. Maarten confirmed that the 7 family is OK; for previous versions a backport was planned, but its status is unknown.
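
The restart rule reported under Middleware can be sketched as a small check. This is only an illustrative sketch, not a tool used by any site: the `JavaService` record, the `needs_restart` function, and all timestamps are hypothetical, and it assumes that a service started after the new IGTF release was installed has already picked up the new CA certificate.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class JavaService:
    """Hypothetical record describing one Java service instance."""
    name: str
    started_at: datetime      # when the service process was last (re)started
    uses_latest_canl: bool    # True if it runs the latest canl-java


def needs_restart(service: JavaService, igtf_installed_at: datetime) -> bool:
    """Return True if the service should be restarted after the IGTF update.

    Rule taken from the minutes: services using the latest canl-java are fine;
    all others need a restart. Assumption (not from the minutes): a service
    started after the update was installed already loaded the new CA.
    """
    if service.uses_latest_canl:
        return False
    return service.started_at < igtf_installed_at


if __name__ == "__main__":
    # Example values only; the times are invented for illustration.
    igtf_installed_at = datetime(2022, 4, 26, 12, 0)
    services = [
        JavaService("dcache-recent", datetime(2022, 4, 1), True),
        JavaService("storm", datetime(2022, 4, 1), False),
        JavaService("argus", datetime(2022, 5, 2, 9, 0), False),
    ]
    for svc in services:
        if needs_restart(svc, igtf_installed_at):
            print(f"{svc.name}: restart needed")
```
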

  • Networks: NTR
  • Security: an advisory was sent about a vulnerability of up to critical risk
Maarten remarked that services using recent Java versions are affected.

AOB:

Topic revision: r16 - 2022-05-02 - NikolayVoytishin
 