WLCG MW Readiness WG 6th meeting Minutes - October 1st, 2014

Agenda

Attendance

  • Local: Alberto Aimar (CERN-IT/SDC management), Vincent Brillault (CERN Computer Security Group, Pakiti expert), David Cameron (ATLAS), Simone Campana (ATLAS), Lionel Cons (Monitoring expert, developer), Maria Dimou (chair & notes), Maarten Litmaath (ALICE & notes), Andrea Manzi (MW Officer, DPM expert), Alberto Peon (T0), Stefan Roiser (LHCb), Andrea Sciabà (CMS & WLCG Ops Coord).
  • Remote: Maria Alandes Pradillo (WLCG Ops Coord co-chairperson), Cristina Aiftimiei (EMI), Stephen Burke (UK), Joel Closier (LHCb), Jeremy Coles (GridPP), Mario David (LIP), Daniel Kouril (Pakiti expert), Joao Pina (EGI Staged Rollout manager).
  • Apologies: Massimo Sgaravatto (Legnaro).

Minutes of previous meeting

The minutes of the last (5th) meeting HERE were approved.

Summary

  • The experience gained from the Readiness verification of DPM, the CREAM CE and BDII at different Volunteer sites was well documented. It paves the way for testing the next DPM version and more products from our shortlist, starting with dCache, StoRM, xrootd and FTS3. This will entail the involvement of more Volunteer sites and the completion of the relevant experiment workflows' documentation.
  • A database (DBoD for now) was designed by the MW Officer Andrea M. and the Package Reporter developer Lionel to store the verification results.
  • ATLAS asked the MW Readiness WG to be involved in the HTCondor testing for various CE types. CMS take care of such testing themselves.
  • LHCb will test the VOMS client on behalf of the MW Readiness WG.
  • The Tier0 will participate in the MW Readiness effort by testing EOS and FTS3.
  • The developer of the MW Package Reporter presented its design, the number of hosts and sites it now runs on, and the alternatives being examined for interoperability with Pakiti. Pakiti expert Daniel Kouril was also connected to the meeting.
  • The next meeting will take place on November 19th at 4pm CET.

MW Officer report

Slides on the agenda. Related discussion:
  • ATLAS populate atlas.cern.ch CVMFS area from tar balls. They will not use grid.cern.ch, as it would be tricky to integrate with the ATLAS SW.
  • LHCb do use grid.cern.ch. They commit to participate in the MW Readiness effort by testing the VOMS client.
  • The DPM pilot for the MW Readiness verification went well and its procedures can now be adapted for dCache and StoRM. The monitoring links for tracking the DPM verification worked OK.
  • The workflow for CREAM verification is not yet complete.
  • Only Edinburgh, Legnaro and GRIF have been active so far, but other Volunteer sites will naturally be involved for other products in the list.
  • The successful verification of a new version may have implications for the baseline of the given product.

WLCG Package Reporter

Slides on the agenda. Related discussion:
  • Lionel (the MW Package Reporter developer) prefers option A2 (see slides 5 and 6 HERE for the alternative scenarios) for joining the Package Reporter to Pakiti. This option uses the WLCG Package Reporter as the only client and the current WLCG Package Database to store all the package information. Pakiti therefore becomes a front end on top of the WLCG Package Database.
  • Daniel (the Pakiti expert) has no preference yet; all 4 options will be evaluated, starting with A1.
  • Lionel, Andrea and the Pakiti team will soon decide which option to adopt (Action 20141001-02).
  • The different options have different scalability concerns and implications for long-term support by WLCG and the Pakiti team. Vincent (the security expert) pointed out that whichever solution is eventually adopted should come with ensured support: once the 2 projects are joined, each will depend on the other, and we want to avoid another support saga like the one we are having with Argus.
  • While each worker node will get the MW via CVMFS, it still needs to report its RPMs to Pakiti. The RPM reporting frequency is daily.
  • The Tier0 grid services shall wait for a single solution that is acceptable to the security team. Nevertheless, the MW Readiness WG needs to continue with deployment of the current Package Reporter to gain operational experience, even if at some point we may have to replace it with a new version for the joint goals. The Package Reporter will allow us to see what is installed where, compare versions with the baseline, compare them with what has been verified by a given experiment, and produce various reports. We will also be able to match operational issues with certain versions that are found at affected sites.
  • The visualisation is being worked on (Action 20140702-06).
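
As an illustration of the "compare versions with the baseline" use case discussed above, here is a minimal Python sketch. It is not the Package Reporter's actual code or data model: the function names, the dictionary layout, the host names and the version numbers are all hypothetical, and the dotted-integer comparison is a simplification of real RPM version ordering.

```python
from typing import Dict, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.8.9' into (1, 8, 9).

    Simplifying assumption: versions are purely numeric and dot-separated.
    Real RPM version comparison (epochs, releases, letters) is more involved.
    """
    return tuple(int(part) for part in version.split("."))


def compare_with_baseline(
    reports: Dict[str, Dict[str, str]],
    baseline: Dict[str, str],
) -> List[Tuple[str, str, str, str]]:
    """Return (host, package, installed version, status) for baseline packages.

    `reports` maps a host name to its reported {package: version} (hypothetical
    layout); `baseline` maps a package to its minimum required version.
    Packages that a host does not have installed are skipped.
    """
    findings = []
    for host, packages in sorted(reports.items()):
        for name, minimum in sorted(baseline.items()):
            installed = packages.get(name)
            if installed is None:
                continue  # product not installed on this host
            if parse_version(installed) >= parse_version(minimum):
                status = "ok"
            else:
                status = "below baseline"
            findings.append((host, name, installed, status))
    return findings


if __name__ == "__main__":
    # Made-up hosts and versions, for illustration only.
    example_reports = {
        "se01.example.org": {"dpm": "1.8.8", "bdii": "5.2.22"},
        "ce01.example.org": {"bdii": "5.2.23"},
    }
    example_baseline = {"dpm": "1.8.9", "bdii": "5.2.22"}
    for row in compare_with_baseline(example_reports, example_baseline):
        print(row)
```

The same loop could be extended to compare against the versions verified by a given experiment by passing a second, per-experiment dictionary in place of the baseline.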

Sites' feedback

Only a few sites were involved so far. They just install the MW as usual, possibly with some manual tweaking because of the special status of the affected services.

Discussion on HTCondor verification

  • ATLAS do not want to track releases of the different CE types: that should be done by the MW Officer (Action 20141001-01).
  • A small testbed is needed to allow for continuous testing of all the components involved: HTCondor-G (pilot factory), CREAM, ARC, HTCondor CE.
  • An ATLAS expert will run the test pilot factory and upgrade HTCondor-G when a new release is announced e.g. by the MW Officer.
  • At least one friendly ATLAS site is needed per CE type and it should upgrade its test CE when a new release is announced e.g. by the MW Officer.
  • CMS have not expressed interest in a similar setup for them. Their position: the testing that OSG, and in particular the glideinWMS developers, do is enough. CMS experts discuss with them which version should be deployed on the pilot factories, etc. From this point of view, HTCondor is seen as an experiment service, because of course its usage as a batch system is not tested by CMS or the glideinWMS team. So, the feeling is that the current interaction between CMS and the HTCondor team is good enough and there is no strong motivation to set up any new testing system as we do for the MW Readiness verification of other services.

Actions

  • 20141001-02: Lionel, Andrea M and the Pakiti team to decide the MW Reporter-Pakiti join option. NEW!
  • 20141001-01: Andrea M to enroll in the condor-announce mailing list (htcondor-world@csNOSPAMPLEASE.wisc.edu) and to inform Napoli, which tests the CREAM CE, so that they install HTCondor and test the two together. NEW!
  • 20140702-06: Andrea M & Lionel to discuss the visualization of testing results. On-going.
  • 20140702-05: Volunteer Sites to install the WLCG MW Package Reporter and report on the clarity of the instructions. Done. Feedback was given and reflected in the code.
  • 20140702-04: Andrea M. to present the status of the DPM Readiness verification exercise at the 20140724 WLCG Ops Coord Meeting. Done. Report HERE.
  • 20140702-03: David C. (ATLAS) to clarify the ATLAS position on the CVMFS use and the exact location for clients’ candidate releases. Done. Documented HERE.
  • 20140702-02: Joel (LHCb) to discuss in the LHCb collaboration and document in their workflow page, linked from the WG twiki, if and which sites will participate in the DPM Readiness verification exercise. Done. LHCb will only be involved with VOMS client verification for now.
  • 20140702-01: Andrea S. (CMS) & David C. (ATLAS) to decide internally if USATLAS or USCMS can take ownership of the validation of new HTCondor versions, via test instances of pilot factories, also validating against CREAM and ARC CEs. Done. See the HTCondor row in the Product Table.

 

Topic revision: r8 - 2018-02-28 - MaartenLitmaath
 