Globus retirement considerations in WLCG

Introduction

In the WLCG Management Board meeting of March 17, 2020, there was a discussion about a possible retirement timeline for the remaining WLCG dependencies on Globus, prompted by the plans and timelines that already exist in OSG.

For WLCG, removing the dependency on GridFTP is being tackled in the TPC WG of the DOMA project.

Currently, GridFTP is also being used for job submissions to CREAM instances, which have almost all been decommissioned by now, and ARC CE instances, which already support HTTPS as an alternative.

While it looks viable for WLCG not to depend on GridFTP by the end of 2021, can we actually remove Globus as a build dependency from the various storage service implementations (see below) that currently make use of it?

That question also has to take into account that other communities may be unable to replace X509 certificates with tokens as "quickly" as WLCG hopes to do, which implies that the affected implementations may need to retain code that is able to deal with X509 somehow. Such code may depend on Globus GSI today.

Furthermore, the transitional X509 + VOMS support in the new WLCG AAI (for services or workflows that do not support WLCG tokens yet) may rely on the RCauth online CA, which makes use of a MyProxy server to communicate with its HSM.

More on GSI: besides its use in conjunction with GridFTP and SRM, WLCG already has a critical dependency on it through MyProxy. For as long as we need the latter, it may not be a big deal to support GSI in addition, though in theory, MyProxy could be made independent of it.

Another very important dependency on GSI comes through GSI-OpenSSH, which is used by VOBOX instances to give login access to privileged members of supported VOs.

WG for Transition to Tokens and Globus Retirement

The WG for Transition to Tokens and Globus Retirement has been mandated to coordinate the many parties involved in the transition from X509 to WLCG tokens and the gradual phaseout of dependencies on Globus.

Current support

The remaining relevant parts of Globus and a few related products are currently being maintained by the Grid Community Forum, with contributions from several partners:

  • EGI, particularly NIKHEF
  • OSG, particularly U Wisconsin-Madison for builds and releases
  • NorduGrid, particularly Uppsala
  • HPC Center Stuttgart
  • ...

Currently supported components:

  • GSI
    • To deal with X509 in many products
    • A critical dependency of the other components
  • GridFTP
    • Currently the main workhorse protocol for data transfers
    • The DOMA TPC WG expects it to be replaced by WebDAV in the course of 2021
    • Also used for job submissions to ARC (and CREAM)
  • MyProxy
    • The standard solution for storing long-lived proxies
  • GSI-OpenSSH
    • SSH extension supporting X509 proxies for authentication
    • Critical for access to WLCG VOBOX instances
  • UberFTP
    • A handy GridFTP client tool
    • In WLCG only used via CREAM, as far as we know
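
The statement that GSI is a critical dependency of the other components can be sketched as a small dependency model. This is only an illustration: the names DEPENDS_ON, must_keep and retirable are hypothetical, and the only relation taken from the list above is that every other component depends on GSI.

```python
# Hypothetical model of the supported components listed above.
# Assumption: the only dependency recorded is "every other component needs GSI".
DEPENDS_ON = {
    "GSI": set(),
    "GridFTP": {"GSI"},
    "MyProxy": {"GSI"},
    "GSI-OpenSSH": {"GSI"},
    "UberFTP": {"GSI"},
}


def must_keep(still_needed):
    """Return the needed components plus everything they depend on."""
    keep = set()
    stack = list(still_needed)
    while stack:
        component = stack.pop()
        if component not in keep:
            keep.add(component)
            stack.extend(DEPENDS_ON[component])
    return keep


def retirable(still_needed):
    """Return the components that nothing in still_needed relies on."""
    return set(DEPENDS_ON) - must_keep(still_needed)


# Example: a VOBOX still needs MyProxy and GSI-OpenSSH.
vobox_needs = {"MyProxy", "GSI-OpenSSH"}
print(sorted(must_keep(vobox_needs)))
print(sorted(retirable(vobox_needs)))
```

In this sketch, keeping MyProxy and GSI-OpenSSH forces GSI to stay as well, while GridFTP and UberFTP could be retired.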

Development aspects

Most of the code is essentially stable. Some fixes may be needed for bugs that may be encountered, e.g. on CentOS 8. The biggest concern was probably the postponed support for TLS v1.3, which has been available since 3 September 2021. However, if we were forced to start using that version, further debugging effort might be required from experts in the GSI code and/or other middleware involved.

Products that are not affected

  • dCache
  • XRootD
  • EOS
  • Echo
  • StoRM (the WebDAV/HTTPS service is independent of Globus)
  • DPM (the WebDAV/HTTPS service is independent of Globus)

Affected products

Grid Community Forum products

  • Globus GridFTP server & client
  • GSI libraries
  • GSI-OpenSSH
  • MyProxy server & client
  • UberFTP

Argus

  • Indirect dependency on GSI through the argus-gsi-pep-callout plugin
    that is typically used by CE services to refer to Argus
    • Such call-outs serve components that themselves depend on GSI

ARC CE

Info provided by Balazs Konya:

  • Once the gridftp-jobplugin job submission is dropped (e.g. either the
    EMI-ES or the REST interface is used instead) and third party storage
    servers do not require GridFTP, ARC is Globus-free.
  • Nothing in ARC pulls in Globus dependencies by default;
    all Globus dependencies are separated in modular packaging.

DPM

Info provided by Fabrizio Furano:

  • Globus is expected to be ejectable with a little tinkering.
  • For HTTP it uses a library called libgridsite, which does not depend on Globus.

FTS

Info provided by Mihai Patrascoiu:

  • FTS has the following as a build dependency:
    • globus-gass-copy-devel
  • Regarding gfal2, of course, the dependencies revolve around the SRM and GridFTP plugins:
    • srm-ifce-devel
    • globus-gass-copy-devel
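
As a rough illustration, the build dependencies named above could be screened for direct Globus packages as follows. The function name and the "globus-" prefix heuristic are assumptions made for this sketch. A prefix match does not catch indirect dependencies such as srm-ifce-devel, which would need to be tracked separately.

```python
# Package lists copied from the FTS and gfal2 items above.
FTS_BUILD_DEPS = ["globus-gass-copy-devel"]
GFAL2_BUILD_DEPS = ["srm-ifce-devel", "globus-gass-copy-devel"]


def direct_globus_deps(packages):
    """Return the sorted, de-duplicated packages whose name starts with 'globus-'.

    Assumption: the name prefix is used as a heuristic for a direct
    Globus build dependency; indirect dependencies are not detected.
    """
    return sorted({name for name in packages if name.startswith("globus-")})


print(direct_globus_deps(FTS_BUILD_DEPS + GFAL2_BUILD_DEPS))
```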

GFAL2

  • Only the plugin libraries for GridFTP and SRM depend on GSI.

HTCondor

  • X509 proxy support depends on GSI.
  • HTCondor-G job submission to ARC CEs currently depends on GridFTP and GSI.
  • HTCondor CE authorization often depends on Globus call-outs to LCMAPS or Argus.

LCMAPS

Info provided by Mischa Sallé:

  • A (potential) dependency of various relevant products:
    • ARC, HTCondor CE (and CREAM)
      • E.g. for VOMS mappings or GSI call-outs to Argus
    • Globus GridFTP server, for VOMS mappings using GSI call-out (i.e. the lcas-lcmaps-gt4-interface)
      • Used in that way by StoRM
    • GSI-OpenSSH, ditto
      • Not on the WLCG VOBOX yet (currently edg-mkgridmap is still used instead)
    • XRootD plugin
      • At least OSG, (US)ATLAS, (US)CMS and GridPP appear to depend on it
    • Other products?

  • LCMAPS (minus its Globus interface) could be made independent of GSI
    • Note: the Globus interface is probably only used by the lcas-lcmaps-gt4-interface which itself depends on Globus

StoRM

Info provided by Andrea Ceccanti:

  • The StoRM SRM frontend depends on CGSI-gSOAP and Globus.
  • We are working on an alternative that won't require those dependencies.
  • Not sure we will be ready by the end of 2021, though.
  • The StoRM WebDAV service does not depend on Globus at all.
  • We are going to keep SRM, which will be needed for tape interaction at CNAF.

WLCG VOBOX

  • GSI-OpenSSH server + client
  • MyProxy client

Others

Affected experiments

ALICE

  • No (recent) dependencies on Globus in offline frameworks.
  • Critical VOBOX dependencies on MyProxy and GSI-OpenSSH.

ATLAS

  • PanDA:
    • The Proxy Cache mechanism requires a credential to be uploaded to CERN’s MyProxy server and fetched by the PanDA server using the MyProxy client (link to Harvester documentation)
    • X509 authentication via mod_gridsite
  • Harvester:
    • ARC-CE submission
      • ARC supports a GridFTP interface built on Globus and an HTTPS interface based on SOAP/XML
      • Most sites still use the GridFTP interface
      • aCT can submit to the HTTPS interface as well as GridFTP
      • HTCondor-G can currently only submit to the GridFTP interface
      • Medium-term plan is for ARC to develop a new REST HTTP interface and HTCondor-G to develop a REST client. ETA mid 2021.
    • HTCondor-CE submission
      • Version 4+ supports WLCG token/SciToken and GSI based authentication
      • Harvester to HTCondor-CE submission demo: ATLASPANDA-505
  • Rucio:
    • No direct dependency on Globus
    • Nota bene
      • X509 authentication relies on mod_gridsite and voms-proxy-init for VOMS
      • Neither of those depends on Globus libraries
    • Dependency on GFAL for clients, which includes a Globus plugin for gsiftp/srm support
    • GlobusOnline support is an optional transfer tool and requires Globus SDK
    • Tape access:
      • Replacing SRM/GridFTP at tape endpoints is out of scope for the WLCG DOMA TPC working group
      • dCache developers are aware that SRM/GridFTP is disappearing, but have no concrete plans yet
      • After some development, WLCG DOMA TPC is replacing SRM/GridFTP with SRM/HTTPS at dCache and StoRM sites.
      • The TAPE REST API subgroup targets a longer-term common solution to tape access

  • Tools:
    • arcproxy: no dependency on Globus

CMS

  • SRMv2, GridFTP and gsiftp are used for data transfer (we are in the process of switching to different protocols and will have a better view by the end of 2020)
  • X509 certificate authentication is used in many places
    • Global Pool (i.e. all job submissions for production and analysis), SAM, GGUS, etc. (there is work ongoing on CMS side; we'll have a better view after capability tokens are in common use, next year)
    • CRAB (tool for submission of user jobs to the Grid) depends on MyProxy in order to be able to submit jobs (including allowing proxy renewal for running jobs by HTCondor) and to move files with user credentials to user home directories at various T2s. Generally speaking, MyProxy can only be decommissioned after all services stop using X509 authentication.
    • CRAB submission from the central server to schedds uses X509-authenticated HTCondor APIs.
      • CRAB will be some work - we'll likely have to carry around both tokens and proxies for a while here.
    • Several production services inside CMSWEB also rely on MyProxy to periodically refresh credentials for internal and external communication.
    • CMS currently uses Rucio via X509 and all that ATLAS wrote on that applies here as well
    • glideinWMS 3.7.1 will allow all internal communications on the global pool to be done via tokens instead of GSI. This will allow the global pool to be upgraded to be GSI-free.
      • For submitting to CEs, HTCondor is following the progress in ARC toward non-GSI submission. HTCondor-CE already works.
    • The Workload Management ecosystem (including WMAgent) relies on X509 certificates to authenticate to other central services (mostly CMSWEB), so it would need to be adapted to work with tokens.
    • WMAgent job submission, via HTCondor, also makes use of X509 certificates and the VOMS roles attached to them (in order to access specific storage paths), so tokens would need to be adopted there as well.
      • A first prerequisite is to enable this on the HTCondor side and start working to submit jobs with tokens from there.

CMS intends to follow the published schedule. It would be useful to ask middleware developers to provide roadmaps on their progress similar to what OSG has done.

LHCb

  • DIRAC itself does not depend on Globus, but the middleware that DIRAC uses does (those middleware products are also covered on this page).
  • DIRAC does not depend on MyProxy

Other communities

WLCG sites need to support other communities that may not all be able to move to tokens on the same timescale as WLCG. That could imply that some effort would need to be found, possibly coordinated with EGI, to maintain at least the GSI-related components, assuming that SRM and GridFTP can already be replaced with WebDAV and/or XRootD, as both are supported by GFAL2.

In principle X509 proxies do not imply the use of GSI, as there are other libraries (e.g. GridSite) that support X509, but in practice there may be hard dependencies on GSI for X509 in other software that is vital to other communities. In fact, even some of the LHC experiments might face such issues and have to re-implement the handling of X509 in some of their software, unless GSI remains supported for a number of years more.

External documents

Topic revision: r28 - 2021-09-19 - MaartenLitmaath
 