DPM upgrade task force (Second circle)

Introduction

WLCG Operations is starting the second upgrade circle of the DPM sites. The goal of this circle is to upgrade DPM instances to a version that supports Third Party Copy (TPC) and in which most of the issues detected during TPC tests have been fixed. The target version is DPM 1.14. In addition to upgrading to DPM 1.14, sites that have not yet enabled DOME must enable it, together with SRR generation and macaroons.

Mandate of the task force

  • Coordinate the upgrade of the DPM sites to DPM version 1.14 or higher and the reconfiguration required to enable DOME, SRR and macaroons.
  • Provide guidance and support to sites during the upgrade and reconfiguration
  • Validate SRRs published by DPM sites and make sure that they can be integrated with CRIC and the WLCG Storage Space Accounting system

Sites requiring upgrade to 1.14 and reconfiguration for DOME

| Site | DPM version (27.08.2020) | Upgrade planned (date) | Comments | GGUS ticket | Contacts |
| UKI-SCOTGRID-GLASGOW | [u'1.8.10'] | | Can be considered DONE: data migrated away from DPM. Until it is decommissioned, the ticket stays open. | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148396 | uki-scotgrid-glasgow@SPAMNOTphysicsNOSPAMPLEASE.gla.ac.uk |
| ICM | [u'1.10.0'] | | DONE. Ticket is closed; the site no longer provides storage. | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148391 | plgrid-admins@SPAMNOTicmNOSPAMPLEASE.edu.pl |
| TW-NCUHEP | [u'1.14.2'] | | Upgrade is done, no SRR yet. | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148395 | cmkuo@SPAMNOTmailNOSPAMPLEASE.cern.ch |
| TR-03-METU | [u'1.14.2'] | DONE | | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148394 | grid@SPAMNOTulakbimNOSPAMPLEASE.gov.tr |
| PSNC | [u'1.13.0'] | | DONE. Storage is decommissioned. | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148393 | egee@SPAMNOTmanNOSPAMPLEASE.poznan.pl |
| NCP-LCG2 | [u'1.13.0'] | | | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148392 | fsaeed@SPAMNOTcernNOSPAMPLEASE.ch |
| GR-07-UOI-HEPLAB | [u'1.13.0'] | | | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148389 | grid@SPAMNOTalphaNOSPAMPLEASE.physics.uoi.gr |
| HK-LCG2 | [u'1.14.2'] | | DONE | https://ggus.eu/index.php?mode=ticket_info&ticket_id=148390 | grid-prod@SPAMNOTatlasNOSPAMPLEASE.cuhk.edu.hk |
| ru-PNPI | ? | | | Just a mail, since the site is not visible in GGUS | |
| IR-IPM-HEP | [u'1.14.2 with DOME and Macaroons'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148446 | grid-hep@SPAMNOTipmNOSPAMPLEASE.ir |
| UKI-SOUTHGRID-BRIS-HEP | 1.9.0 | Plan is to RETIRE DPM (dmlite) in favor of xrootd-SE (in progress) | Delay due to staff shortage | https://ggus.eu/index.php?mode=ticket_info&ticket_id=153106 | lcg-admin@SPAMNOTbristolNOSPAMPLEASE.ac.uk |

Sites requiring upgrade to 1.14.*

| Site | DPM version (27.08.2020) | Upgrade planned (date) | Comments | GGUS ticket | Contacts |
| IN2P3-IRES | [u'1.14.2 with DOME and macaroons'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148437 | grid.admin@SPAMNOTiphcNOSPAMPLEASE.cnrs.fr |
| GRIF | [u'1.14.2 with DOME and macaroons'] | | DONE for IRFU, LAL and LPHE upgrade to 1.14.2 | https://ggus.eu/?mode=ticket_info&ticket_id=148434 | grid.admin@SPAMNOTgrifNOSPAMPLEASE.fr |
| UKI-SCOTGRID-ECDF | [u'1.14.2 with DOME'] | End of October | DONE. Upgrade completed and Macaroons enabled | https://ggus.eu/?mode=ticket_info&ticket_id=148465 | wlcg-support-ecdf@SPAMNOTmlistNOSPAMPLEASE.is.ed.ac.uk |
| UKI-SOUTHGRID-OX-HEP | [u'1.13.0 with DOME'] | | DONE. DPM decommissioned | https://ggus.eu/?mode=ticket_info&ticket_id=148466 | lcg_manager@SPAMNOTphysicsNOSPAMPLEASE.ox.ac.uk |
| INFN-COSENZA | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148442 | recas.alarm@SPAMNOTgmailNOSPAMPLEASE.com |
| TW-NTU-HEP | [u'1.14.2 with DOME'] | DONE | | https://ggus.eu/?mode=ticket_info&ticket_id=148459 | sysadmin@SPAMNOThep1NOSPAMPLEASE.phys.ntu.edu.tw |
| UNIBE-LHEP | [u'1.14.2 with DOME'] | | DONE. | https://ggus.eu/?mode=ticket_info&ticket_id=148467 | it-ops@SPAMNOTlhepNOSPAMPLEASE.unibe.ch |
| UKI-NORTHGRID-LANCS-HEP | [u'1.14.2 with DOME'] | Done | DONE. | https://ggus.eu/?mode=ticket_info&ticket_id=148462 | lcg-admin@SPAMNOTlancsNOSPAMPLEASE.ac.uk |
| BUDAPEST | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148430 | gridadm@SPAMNOTrmkiNOSPAMPLEASE.kfki.hu |
| UKI-NORTHGRID-MAN-HEP | [u'1.14.2 with DOME', u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148463 | man-tier2-helpdesk@SPAMNOTcernNOSPAMPLEASE.ch |
| Kharkov-KIPT-LCG2 | [u'1.14.2 with DOME'] | end of October | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148447 | grid_support@SPAMNOTkiptNOSPAMPLEASE.kharkov.ua |
| IN2P3-LPC | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148439 | grid-admin@SPAMNOTclermontNOSPAMPLEASE.in2p3.fr |
| IN2P3-LPSC | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148440 | grid.admin@SPAMNOTlpscNOSPAMPLEASE.in2p3.fr |
| UKI-LT2-Brunel | [u'1.14.2 with DOME', u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148460 | lcg-admin@SPAMNOTbrunelNOSPAMPLEASE.ac.uk;rlopes@SPAMNOTcern.ch;daniela.bauer.grid@SPAMNOTgooglemail.com; |
| BEIJING-LCG2 | [u'1.14.2 with DOME', u'1.14.2 With DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148429 | lcg-admin@SPAMNOTihepNOSPAMPLEASE.ac.cn |
| UKI-SCOTGRID-DURHAM | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148464 | oper.ip3@SPAMNOTdurhamNOSPAMPLEASE.ac.uk |
| FMPhI-UNIBA | [u'1.14.2 with DOME'] | by end of Nov | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148432 | gridmaster@SPAMNOTdnpNOSPAMPLEASE.fmph.uniba.sk |
| INFN-ROMA1 | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148445 | grid-prod@SPAMNOTroma1NOSPAMPLEASE.infn.it |
| UKI-NORTHGRID-LIV-HEP | [u'1.14.2 with DOME', u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148470 | gridteam@SPAMNOThepNOSPAMPLEASE.ph.liv.ac.uk |
| TR-10-ULAKBIM | [u'1.14.0 with DOME', | End of September | DONE. Upgrade completed and Macaroons enabled, need to upgrade to 1.14.2 | https://ggus.eu/?mode=ticket_info&ticket_id=14845 | grid@SPAMNOTulakbimNOSPAMPLEASE.gov.tr |
| CBPF | [u'1.14.2 with DOME'] | | DONE, should be checked by Alessandra | https://ggus.eu/?mode=ticket_info&ticket_id=148469 | gridlafex@SPAMNOTcbpfNOSPAMPLEASE.br |
| INFN-FRASCATI | [u'1.14.2 with DOME', u'1.13.0 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148443 | grid-prod@SPAMNOTlnfNOSPAMPLEASE.infn.it |
| TOKYO-LCG2 | [u'1.14.2 with DOME'] | by end of October | Done | https://ggus.eu/?mode=ticket_info&ticket_id=148454 | lcg-admin@SPAMNOTiceppNOSPAMPLEASE.s.u-tokyo.ac.jp |
| IN2P3-LAPP | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148438 | support-grid@SPAMNOTlappNOSPAMPLEASE.in2p3.fr |
| TW-NCHC | [u'1.14.2 with DOME+legacy'] | | DONE, needs to be checked by experiments | https://ggus.eu/?mode=ticket_info&ticket_id=148457 | lincy@SPAMNOTnchcNOSPAMPLEASE.org.tw |
| CYFRONET-LCG2 | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148431 | support@SPAMNOTgridNOSPAMPLEASE.cyfronet.pl |
| INFN-NAPOLI-ATLAS | [u'1.14.2 with DOME end Macaroons'] | End of October | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148444 | grid-prod-atlas-mon@SPAMNOTnaNOSPAMPLEASE.infn.it |
| Australia-ATLAS | [None, u'1.13.1 with DOME'] | | | https://ggus.eu/?mode=ticket_info&ticket_id=148428 | coepp-sysadmin@SPAMNOTlistsNOSPAMPLEASE.unimelb.edu.au |
| RO-07-NIPNE | [u'1.14.2 with DOME', u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148452 | ciubancan@SPAMNOTnipneNOSPAMPLEASE.ro |
| INDIACMS-TIFR | [None, u'1.14.2 WITH DOME'] | Claimed to be done, was not checked properly by experiments, SRR is not accessible | | https://ggus.eu/?mode=ticket_info&ticket_id=148441 | brij.jashal@SPAMNOTtifrNOSPAMPLEASE.res.in |
| ZA-WITS-CORE | [u'1.14.2 with DOME'] | | DONE 29 Nov | https://ggus.eu/?mode=ticket_info&ticket_id=148468 | scott.hazelhurst@SPAMNOTwitsNOSPAMPLEASE.ac.za |
| Taiwan-LCG2 | [u'1.14.2 with DOME'] | 2020-09-28 | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148453 | ops@SPAMNOTlistsNOSPAMPLEASE.grid.sinica.edu.tw |
| UKI-LT2-RHUL | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148461 | S.George@SPAMNOTrhulNOSPAMPLEASE.ac.uk;antonio.perezfernandez@SPAMNOTrhul.ac.uk;grid-admin@SPAMNOTpp.rhul.ac.uk;duncan.rand@SPAMNOTimperial.ac.uk |
| praguelcg2 | [u'1.15.0 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148450 | ngi-firstline@SPAMNOTmetacentrumNOSPAMPLEASE.cz |
| NCBJ-CIS | [u'1.14.2 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148448 | admins@SPAMNOTcisNOSPAMPLEASE.gov.pl |
| IN2P3-CPPM | [u'1.14.2 with DOME', u'1.14.0 with DOME'] | | DONE | https://ggus.eu/?mode=ticket_info&ticket_id=148436 | gridadmin@SPAMNOTcppmNOSPAMPLEASE.in2p3.fr |
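
The version cells in the tables above are Python-style lists such as [u'1.14.2 with DOME'], sometimes with several entries per site. The sketch below shows one way such a cell could be checked against the 1.14 target. The helper names parse_versions and meets_target are hypothetical and are not part of any DPM or CRIC tool; a site passes only if every listed version reaches the minimum.

```python
import re

# Matches "1.14" or "1.14.2"; the patch level is optional.
VERSION_RE = re.compile(r"(\d+)\.(\d+)(?:\.(\d+))?")


def parse_versions(cell):
    """Return a list of (version_tuple, has_dome) pairs found in a table cell."""
    # Quoted entries if the cell is a list, otherwise treat the cell as one entry.
    entries = re.findall(r"'([^']*)'", cell) or [cell]
    parsed = []
    for entry in entries:
        match = VERSION_RE.search(entry)
        if match is None:
            continue  # e.g. "?" or an entry without a version number
        version = tuple(int(part or 0) for part in match.groups())
        parsed.append((version, "dome" in entry.lower()))
    return parsed


def meets_target(cell, minimum=(1, 14, 0)):
    """True if the cell lists at least one version and all of them are >= minimum."""
    parsed = parse_versions(cell)
    return bool(parsed) and all(version >= minimum for version, _ in parsed)
```

For example, meets_target("[u'1.14.2 with DOME', u'1.13.0 with DOME']") is False because one of the two listed instances is still on 1.13.0.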

Upgrade instructions

Test that storage works after upgrade

For example, for Cosenza:

    ./smoke-test.sh https://recas-se-01.cs.infn.it:443/dpm/cs.infn.it/home/dteam
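
The Cosenza URL follows the pattern https://<head node>:443/dpm/<domain>/home/<VO>. The sketch below builds the same command for another site under that assumption. The function names are hypothetical, and treating exit status 0 as success is an assumption about smoke-test.sh that this page does not state.

```python
import subprocess


def dpm_home_url(host, domain, vo, port=443):
    """Build the WebDAV URL of a VO home directory (layout assumed from the example)."""
    return f"https://{host}:{port}/dpm/{domain}/home/{vo}"


def smoke_test_command(host, domain, vo, script="./smoke-test.sh"):
    """Return the argument list for running the smoke test against one VO home."""
    return [script, dpm_home_url(host, domain, vo)]


def run_smoke_test(host, domain, vo):
    """Run the script; assumes (unverified) that exit status 0 means success."""
    result = subprocess.run(smoke_test_command(host, domain, vo))
    return result.returncode == 0
```

Calling smoke_test_command("recas-se-01.cs.infn.it", "cs.infn.it", "dteam") reproduces the Cosenza command line shown above.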

Recommended configuration

  • ATLAS (Rucio)
    • Use at least DOME DPM 1.14 + XRootD 4.12.4 + davix 0.7.6; the latest stable versions from EPEL are recommended
      • dmlite is linked against the xrootd packages available at its release date, so moving to the latest dmlite requires the most recent xrootd packages
      • August 2020: xrootd R4 is supported. Xrootd R5 is not supported.
      • enable GridFTP redirection: puppet head+disknode configuration option gridftp_redirect (enabled by default since DPM 1.14)
      • enable XRootD checksums: puppet head+disknode configuration option configure_dpm_xrootd_checksum (enabled by default since DPM 1.13)
      • optionally enable TPC XRootD delegation: puppet disknode configuration option configure_dpm_xrootd_delegation (enabled by default since DPM 1.13)
      • to support IPv4-only clients that have IPv6 enabled in the GridFTP plugin (the default in gfal-2.17.0, reverted to IPv4 in later releases), epsv_match must be enabled on a dual-stack DPM; see LCGDM-2817
    • AGIS configuration (example for SE, panda)
      • GridFTP is still the preferred protocol, with priority 0, for TPC activities (requires GridFTP redirection)
        • during Autumn/Winter 2020 we'll gradually move ATLAS sites to HTTP-TPC (ADCINFR-166)
      • XRootD for lan and wan read+write (write works only with XRootD checksums enabled)
      • rucio mover for panda queues (the rucio mover uses storage protocols according to the preferences defined in AGIS)
      • each protocol in AGIS should have monitoring enabled to be part of ATLAS SAM tests / ETF check_mk
      • EGI sites should also register each SE protocol with GOCDB (example: SRM, GridFTP, XRootD, WebDAV)
        • GOCDB storage protocols are used to automatically postpone ATLAS transfers when site storage is in downtime
      • fully SRM-less operation requires additional configuration of the Storage Resource Reporting (SRR)
    • Configure Argus banning available since DOME DPM 1.14.0
  • CMS (PhEDEx)
  • Dirac users
    • internally use GFAL for transfers (unless you still use deprecated protocols)
    • if your DPM supports IPv4 + IPv6, be aware that IPv4-only clients can't access data using the gsiftp protocol unless you follow the instructions in LCGDM-2817
    • LFC catalog
      • deprecated & EOL - you should think about migration
      • full file URL stored in catalog - can't easily switch from SRM protocol
    • DFC catalog
      • possible to configure non-SRM transfer protocols
      • with GridFTP redirection enabled in DPM, switching from GFAL2_SRM2 to GFAL2_GSIFTP should be almost transparent
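
The minimum versions recommended for ATLAS in this section (DPM 1.14, XRootD 4.12.4, davix 0.7.6, and no XRootD 5 as of August 2020) can be turned into a simple check. This is only a sketch: the check_stack function and the dictionary keys are hypothetical, and it assumes plain dotted version strings without RPM release suffixes.

```python
# Minimum versions taken from the ATLAS recommendation in this section.
MINIMUM = {"dpm": (1, 14), "xrootd": (4, 12, 4), "davix": (0, 7, 6)}


def as_tuple(version):
    """Convert a plain dotted version such as '4.12.4' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def check_stack(installed):
    """Return a list of problems for a {component: version} mapping."""
    problems = []
    for name, minimum in MINIMUM.items():
        version = installed.get(name)
        if version is None:
            problems.append(f"{name}: not installed")
        elif as_tuple(version) < minimum:
            wanted = ".".join(str(part) for part in minimum)
            problems.append(f"{name} {version} is older than {wanted}")
    xrootd = installed.get("xrootd")
    if xrootd is not None and as_tuple(xrootd)[0] >= 5:
        problems.append(f"xrootd {xrootd}: XRootD 5 was not supported as of August 2020")
    return problems
```

An empty list means the three components satisfy the stated minimums; it says nothing about the puppet options (gridftp_redirect, XRootD checksums, delegation), which still have to be checked in the site configuration.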

After reconfiguration for DOME make sure that SRR is enabled
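
One way to confirm that SRR is enabled is to fetch the published document and check that it parses and lists storage shares. The sketch below works on the document text only. It assumes the SRR JSON has a top-level storageservice object with a storageshares list whose entries carry name, totalsize and usedsize; these field names are an assumption based on the WLCG SRR format and should be checked against the current specification. The function name summarize_srr is hypothetical.

```python
import json


def summarize_srr(document):
    """Return (errors, shares) for an SRR JSON document given as text.

    shares is a list of (name, totalsize, usedsize) tuples.
    Field names are assumed, not taken from this page.
    """
    errors, shares = [], []
    try:
        data = json.loads(document)
    except ValueError as exc:
        return [f"not valid JSON: {exc}"], shares
    service = data.get("storageservice") if isinstance(data, dict) else None
    if not isinstance(service, dict):
        return ["missing 'storageservice' object"], shares
    for share in service.get("storageshares", []):
        name = share.get("name", "<unnamed>")
        total, used = share.get("totalsize"), share.get("usedsize")
        if not isinstance(total, int) or not isinstance(used, int):
            errors.append(f"{name}: totalsize/usedsize missing or not integers")
            continue
        shares.append((name, total, used))
    if not shares and not errors:
        errors.append("no storage shares published")
    return errors, shares
```

The document text can come from any HTTP client; where a given DPM head node publishes its SRR is site configuration and is not assumed here.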

How to enable SRR

After making changes to your service, please update the information in CRIC

Authentication & authorization step

  • Go to the WLCG CRIC server and click "Site topology" (menu at the top of the page) -> Services. Enable filtering by clicking on the 'Filter' button and select your site. By default, you won't see the implementation and implementation version columns in the table. To see this information, click on 'Columns' and then select the corresponding columns in the drop-down list.

  • You should be able to list all CRIC entities (sites (GocDB/OIM and experiment-specific ones), federations, pledges, services, storage protocols and queues) without authentication. However, to see the details of any particular entity you will be asked to log in.

  • Those who are registered in the CERN DB should use SSO authentication. Authentication with a certificate is not yet enabled on this instance; it will come soon.

  • Those who are not registered in the CERN DB need to ask for a CRIC local account. Please send a mail to cric-devs@cernNOSPAMPLEASE.ch with your name, family name and the mail address to be used by CRIC to communicate with you.

  • As soon as you are logged in, you will be able to see the details of any CRIC entity; however, in order to edit information you need specific privileges.
    • As soon as you are authenticated, you will see 'Request privileges' at the top of the page next to your login name. Click on it and follow the request procedure, which allows you to request global admin, site admin or federation admin privileges. Ask for site admin privileges for your site. You will shortly be informed that your privileges are enabled. Please log in again.

Editing storage info

  • Once you log in with appropriate privileges, you should be able to edit information about your site. At the moment we are particularly interested in the storage information for your site, namely its implementation, implementation version and SRR URL when it is enabled.

  • CRIC creates a virtual storage service per site/per VO/per media/per implementation. By default it creates 1 disk and 1 tape virtual storage for every VO which is served by a given T1 site. However, if for a given VO there are storage instances for the same media but with different implementations (for example EOS and dCache instances for disk storage for ATLAS), CRIC should create two different disk virtual storage instances for this VO. Unfortunately, for the moment there is no reliable primary source for this kind of information, so it is highly likely that only a single virtual storage will be created by CRIC in such cases. It would be great if you could correct this using the CRIC UI and add the other virtual storage instances for your site, with their implementation, implementation version and SRR URL when it is enabled. In the future we hope to get this information through SRR (Storage Resource Reporting).

  • In the service table view, click on a particular service name
  • You get a form with detailed information about service
  • Click on the 'Edit' button under the first block of information
  • You get another form. Please correct the 'Version' of your DPM implementation. If DOME is enabled, append 'with DOME' to the version number and provide the "Resource Reporting URL" value
  • Click on 'Check input data' and save info

Creating a new virtual storage instance in CRIC

  • Starting from the entry page: https://wlcg-cric.cern.ch/
    • in the block "Site & Services" click 'Create Storage Service'. You get a form to fill in
  • Keep service name field empty as the form suggests
  • Select your site from the drop-down menu
  • Service type (SE) should not be touched
  • Select Disk or Tape media in the "Architecture" field drop-down menu
  • Provide value for implementation (EOS, Castor, Xrootd, dCache, DPM)
  • Provide value for implementation version
  • You can provide a value in the endpoint field or leave it empty if it does not make sense
  • Please, select value for the VO name. As mentioned above, the virtual storage in CRIC is created for a single VO even though several VOs can share the same physical storage service of the site
  • Leave 'ACTIVE' object state
  • All other attributes are optional, you can leave them empty
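
The rules in the list above can be summarised as a small validation sketch. The dictionary keys (name, site, media, implementation, implementation_version, vo, state) are hypothetical labels for the form fields and are not CRIC API names; the allowed values are the ones listed above.

```python
# Allowed values as listed in the form instructions above.
IMPLEMENTATIONS = {"EOS", "Castor", "Xrootd", "dCache", "DPM"}
MEDIA = {"Disk", "Tape"}


def validate_storage_form(form):
    """Return a list of problems for a dict describing the 'Create Storage Service' form."""
    errors = []
    if form.get("name"):
        errors.append("name: keep empty, as the form suggests")
    if not form.get("site"):
        errors.append("site: required")
    if form.get("media") not in MEDIA:
        errors.append("media: must be Disk or Tape")
    if form.get("implementation") not in IMPLEMENTATIONS:
        errors.append("implementation: must be one of " + ", ".join(sorted(IMPLEMENTATIONS)))
    if not form.get("implementation_version"):
        errors.append("implementation_version: required")
    if not form.get("vo"):
        errors.append("vo: required, one virtual storage per VO")
    if form.get("state", "ACTIVE") != "ACTIVE":
        errors.append("state: leave as ACTIVE")
    return errors
```

The endpoint is deliberately not checked, since it may be left empty.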

Creating a new protocol for a given virtual storage in CRIC

  • Currently, even if a single protocol is shared by several virtual storage instances in CRIC, a new protocol instance has to be created for each virtual storage instance
  • To create new protocol for a given virtual storage instance, select corresponding service in the service list and click on the name to get a detailed description of the virtual storage service
  • Below the table with the list of protocols, click on the 'Add protocol' button. You will get a form to fill.
  • Leave the name of the protocol empty, the system will generate it for you
  • "Flavour" and "endpoint" are mandatory attributes, other fields could be empty

Deleting virtual storage instance from CRIC

For the time being, deletion from the UI is not allowed. Change the object state to "Disabled" to make the instance disappear from the listing.

Deleting protocols from CRIC

The protocol attached to a particular virtual storage can be deleted from the protocol list from the detailed page describing the virtual storage service.

DPM service monitoring for EGI.

In order to enable DPM service monitoring for EGI one needs to configure webdav (HTTPS) for the ops VO and register the endpoint on GOCDB

Registration of the webdav service endpoint in GOCDB

To register the webdav service endpoint in GOCDB, follow HOWTO21 to fill in the proper information.

In particular:

Enable gridftp monitoring for ops VO (if you provide such protocol)

  • register a new service endpoint, associating the storage element hostname with the service type "globus-GRIDFTP", with the "production" flag disabled;
  • in the "Extension Properties" section of the service endpoint page, fill in the following fields:
    • Name: SE_PATH
    • Value: /dpm/ui.savba.sk/home/ops (this is an example, set the proper path)
  • check whether the tests are OK (it might take some hours for the new service endpoint to be detected) and then switch the production flag to "yes"
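
Before saving the extension property, the SE_PATH value can be sanity-checked. The sketch below assumes the value follows the /dpm/<domain>/home/<VO> layout of the example; the regular expression and the function names are hypothetical and only illustrate the check.

```python
import re

# Layout assumed from the example value /dpm/ui.savba.sk/home/ops.
SE_PATH_RE = re.compile(r"^/dpm/(?P<domain>[A-Za-z0-9.-]+)/home/(?P<vo>[A-Za-z0-9._-]+)$")


def parse_se_path(value):
    """Return (domain, vo) for a well-formed SE_PATH value, or raise ValueError."""
    match = SE_PATH_RE.match(value)
    if match is None:
        raise ValueError(f"unexpected SE_PATH layout: {value!r}")
    return match.group("domain"), match.group("vo")


def check_extension_property(name, value, expected_vo="ops"):
    """Return a list of problems with one GOCDB extension property."""
    problems = []
    if name != "SE_PATH":
        problems.append(f"name should be SE_PATH, got {name!r}")
    try:
        _, vo = parse_se_path(value)
    except ValueError as exc:
        problems.append(str(exc))
    else:
        if vo != expected_vo:
            problems.append(f"path points to VO {vo!r}, expected {expected_vo!r}")
    return problems
```

With the example value, check_extension_property("SE_PATH", "/dpm/ui.savba.sk/home/ops") returns an empty list.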

Detected problems

Useful links

Participants

  • Fabrizio Furano (DPM)
  • Oliver Keeble (DPM and WLCG Steering Group)
  • Dimitrios Christidis (WLCG Storage Space Accounting)
  • Julia Andreeva (WLCG Operations Coordination)

First Upgrade Circle

Topic revision: r214 - 2021-09-02 - JuliaAndreeva
 