WLCG Operations Coordination Minutes, July 5th, 2018

Highlights

Agenda

Attendance

  • local: Julia (WLCG), Maarten (ALICE + WLCG), Mayank (WLCG), Renato (LHCb)
  • remote: Alessandro (CNAF), Catherine (LPSC + IN2P3), Dimitrios (WLCG), Eric (IN2P3-CC), Gareth (RAL), Johannes (ATLAS), Stephan (CMS)
  • apologies:

Operations News

  • the next meeting will be on Sep 13
    • please let us know if that date would pose a significant problem

Special topics

  • Follow-up regarding GDPR and WLCG services. This topic was discussed at the June MB. The latest proposal consists of the following:
    • Produce a light-weight “Code of Conduct” for WLCG (and EGI/EOSC-hub). This implies replacing the existing EGI/WLCG Data Protection Policy Framework and providing a general WLCG Data Privacy Statement and a template for others to use.
    • The EOSC-hub/AARC2/WLCG policy team will prepare draft documents (expected to be approved in early autumn)
    • WLCG Ops should continue building their list of services needing a Privacy Statement

Middleware News

Discussion

  • Maarten: EGI and OSG have just sent a security advisory concerning Singularity

Tier 0 News

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE

  • Normal to high activity levels on average
  • No major problem

ATLAS

  • Stable grid production over the last weeks with up to ~300-350k concurrently running job slots. Additional HPC contributions with peaks of 100k concurrently running job slots.
  • The usual mix of grid workflows is ongoing: MC generation, simulation, and data and MC derivation production. MC reconstruction is currently at a smaller scale, with a larger campaign planned to start in August or September.
  • The first larger-scale test of MC pile-up simulation and digitisation with MC overlay is upcoming.
  • Commissioning of the Harvester submission system via PanDA is ongoing at US HPCs and on the Grid (CERN, BNL, Iberian cloud)
  • EOSATLAS: ATLAS is worried about EOS stability.
    • Several short (20 min to 1 h) EOS issues in the past month.
    • Waiting for a post-mortem report (as a Twiki page under ServiceIncidentReports) on the current EOS instabilities

  • Julia: we will ask the EOS team to give a presentation in our next meeting

CMS

  • LHC in beta*=90m run
  • Tier-1 keeping up with incoming data
  • compute system busy at about 250k cores
    • usual mix of about 20% analysis 80% production
  • CMS EOS crash last week triggered by an eosdump of one of our legacy cleaning scripts

LHCb

  • Productions:
    • Collision18 production ongoing
    • User and Simulations running
  • No major problems

Ongoing Task Forces and Working Groups

Accounting TF

  • Following Di's question at the last WLCG Operations Coordination meeting regarding accounting for jobs submitted via BOINC, the Accounting Task Force twiki page has been updated with short instructions from Andrew McNab. Di will try them and report his experience at the September Accounting Task Force meeting.

Archival Storage WG

Update of providing tape info

PLEASE CHECK AND UPDATE THIS TABLE

| Site | Info enabled | Plans | Comments |
| CERN | YES | | |
| BNL | YES | | |
| CNAF | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| FNAL | YES | | |
| IN2P3 | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| JINR | YES | | |
| KISTI | NO | | KISTI has been contacted. Will enable it soon |
| KIT | YES | | |
| NDGF | NO | | NDGF has a distributed storage which complicates the task. Discuss with NDGF possibility to do aggregation on the storage space accounting server side |
| NLT1 | NO | | |
| NRC-KI | YES | | |
| PIC | YES | | |
| RAL | YES | | Space accounting info is integrated in the portal. Other metrics are on the way |
| TRIUMF | YES | | |

  • Julia: we will contact SARA by e-mail

One can see all sites integrated in storage space accounting for tapes here

  • Dimitrios: mind that the plots may show some jagged lines due to recent network issues

  • Julia: we will soon move the prototype to production

Information System Evolution TF

  • NTR

IPv6 Validation and Deployment TF

Detailed status here.

Machine/Job Features TF

Monitoring

MW Readiness WG

Network Throughput WG


Squid Monitoring and HTTP Proxy Discovery TFs

  • A measurement to report: CMS@Home is now using Cloudflare caching (openhtc.io), and measurements show an average saving of nearly 5 minutes in job startup time.

Traceability WG

Container WG

Action list

| Creation date | Description | Responsible | Status | Comments |
| 03 Nov 2016 | Review VO ID Card documentation and make sure it is suitable for multicore | WLCG Operations | In progress | GGUS:133915 |
| 07 Jun 2018 | Followup of OSG service URL changes | WLCG Operations | Ongoing | We suggest that for all middleware using various OSG-related URLs the experiments look at this page and inform operations in case you need more help |
| 07 Jun 2018 | GDPR policy implementation across WLCG and experiment services | WLCG Operations + experiments | Ongoing | |

Specific actions for experiments

| Creation date | Description | Affected VO | Affected TF/WG | Comments | Deadline | Completion |

Specific actions for sites

| Creation date | Description | Affected VO | Affected TF/WG | Comments | Deadline | Completion |

AOB

Topic revision: r12 - 2018-07-10 - RenatoSantana
 