Uni Bonn configuration
Production setup
- Offering spacetoken UNI-BONN_LOCALGROUPDISK
- 1 xrootd redirector (for load balancing, redirecting only to online DTNs etc.) offering both xroot and WebDAV protocols, all via native xrootd.
- 8 DTNs (with 1 GBit/s outbound connectivity each), all running xrootd offering the same protocols, and handling authentication.
- All 8 DTNs are TPC-enabled for xrootd via a robot certificate and by macaroon support.
- We are running the latest xrootd (5.0.0).
- Filesystem behind this is CephFS, i.e. a standard POSIX filesystem with xattr support. xrootd makes good use of that to record the checksums.
- Site-local scratch directory for VOMS-authenticated users to use for copying data in and out. This is purged every 7 days.
- We use xrootd-voms, not lcmaps.
- We support multiple VOs: atlas, ops, dteam, wlcg, all with their separate space.
- OS is CentOS 7, but we also tested this on CentOS 8. Since the WLCG repository is still CentOS 7 only, you'll need to set force_wlcgrepo_centos7 in gridcert.pp when using CentOS 8, which still works since we only rely on the noarch VOMS config packages from there.
- All configuration is handled via Puppet.
- TLS is not enforced yet, only used for capable clients opportunistically.
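The 7-day purge of the site-local scratch directory listed above could be sketched as follows. This is a minimal illustration, not the script deployed by xrootd_scratch.pp: the function names and the use of the modification time as the age criterion are assumptions.

```python
import os
import time


def find_expired_files(root, max_age_days=7, now=None):
    """Return paths of files under root not modified for max_age_days.

    Assumption: age is judged by mtime; the production purge may use
    a different criterion.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    expired = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_mtime < cutoff:
                    expired.append(path)
            except FileNotFoundError:
                # File vanished between listing and stat; nothing to purge.
                continue
    return sorted(expired)


def purge(root, max_age_days=7, dry_run=True):
    """Delete expired files; with dry_run=True only report them."""
    victims = find_expired_files(root, max_age_days)
    if not dry_run:
        for path in victims:
            os.remove(path)
    return victims
```

Running with dry_run=True first gives a list of candidates to review before anything is deleted.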
Configuration Management via Puppet
All components of our site are deployed with Foreman and configured via Puppet.
We re-use existing Puppet modules whenever possible. All Puppet modules we use are forked to https://github.com/unibonn/. For any feature or bugfix, we send a pull request to the corresponding upstream project after testing the modification in production.
More details on the Puppet modules and classes relevant to the XRootD setup are given below.
Used Puppet modules
Developed Profile Classes
- xrootd.pp
  - For the actual xrootd and cms setup, robot certificate deployment, xrdcp wrapper deployment (logs to /var/log/xrootd/grid/xrdcp-voms.log, logrotated), and macaroon shared secret deployment. This needs to be applied to every manager and DTN node. Also runs maintenance cronjobs, see below. Reports expiry of the robot certificate to Zabbix.
- xrootd_user.pp
  - For the setup of the xrootd user account with a fixed UID/GID. This needs to be applied to any node mounting the POSIX filesystem.
- xrootd_scratch.pp
  - For the handling of the site-local scratch directory.
- gridcert.pp
  - To deploy the host certificates. Reports expiry of certificates to Zabbix.
- griddarkdata.pp
  - Produces the monthly dark data report.
- gridspaceusage.pp
  - Produces the JSON for space usage reporting (see also: Rucio space reporting documentation) and an SRR JSON (see also: LCG/StorageSpaceAccounting). The script we use naturally has some CephFS specifics, but should be easily adaptable to most POSIX filesystems. Includes Zabbix integration.
- gridmonitoring.pp
  - Checks the status of a spacetoken in AGIS and lets you know by mail when the status changes.
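The certificate expiry reporting done by xrootd.pp and gridcert.pp can be illustrated with a small helper. This is a sketch under assumptions, not the code in those profiles: it parses the notAfter= line printed by `openssl x509 -noout -enddate` and returns the remaining days. The function names are hypothetical, and how the value is handed to Zabbix is left out because it is not described here.

```python
import ssl
import subprocess
import time


def days_left_from_enddate(enddate_line, now=None):
    """Convert 'notAfter=Jan  5 09:34:43 2018 GMT' into remaining days."""
    now = time.time() if now is None else now
    cert_time = enddate_line.strip().split("=", 1)[1]
    expiry = ssl.cert_time_to_seconds(cert_time)
    return (expiry - now) / 86400


def days_left(cert_path):
    """Ask the openssl CLI for the end date of a PEM certificate."""
    out = subprocess.run(
        ["openssl", "x509", "-noout", "-enddate", "-in", cert_path],
        check=True, capture_output=True, text=True,
    ).stdout
    return days_left_from_enddate(out)
```

A negative return value means the certificate has already expired.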
Required extra files
- You need to add a robot certificate registered with your VO, which will be used as fallback by the server for Third-Party-Copy when no certificate is delegated. Beware that this effectively means the credential can be used for copies by any user allowed to access the xrootd servers!
- The puppet profile code expects the public certificate in files/xrootd/robotcert.pem and the private key in files/xrootd/robotkey.pem.
- You need a shared secret between all servers for issuing and validating macaroons. This can be created by:
  - openssl rand -base64 -out macaroon-secret 64
- This file is expected to be in files/xrootd/macaroon-secret for the puppet profile code to deploy it.
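Since every server must hold the same secret, a quick sanity check of the generated file before placing it in files/xrootd/macaroon-secret may be useful. The sketch below decodes the base64 output and fingerprints it. The function names are hypothetical and nothing here is part of the Puppet profile code.

```python
import base64
import hashlib


def decode_secret(path):
    """Return the raw bytes of a base64-encoded secret file.

    openssl wraps base64 output across lines; b64decode ignores
    the newlines by default.
    """
    with open(path, "rb") as handle:
        return base64.b64decode(handle.read())


def secret_fingerprint(path):
    """SHA-256 of the decoded secret, for comparing copies across
    servers without printing the secret itself."""
    return hashlib.sha256(decode_secret(path)).hexdigest()
```

For a file created with the openssl command above, len(decode_secret("macaroon-secret")) should be 64, and the fingerprint should be identical on every server.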
Maintenance Cronjobs
Maintenance cronjobs are installed by xrootd.pp if the smart class parameter maintenance_cronjobs is enabled. These cronjobs should usually run on only one node, and take care of the following tasks:
- Fixing permissions for downloaded files. This cronjob should normally not change anything, since we deploy and hook up an eventstream listener that fixes permissions for all files. It can be toggled separately via the permission_fix_cronjob profile class parameter.
- Purging empty directories.
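The empty-directory purge can be sketched as a bottom-up walk. This is an illustration only, not the cronjob installed by xrootd.pp; the function name is hypothetical. It relies on os.rmdir refusing to remove non-empty directories, so a directory that receives a file during the walk is left alone.

```python
import os


def purge_empty_dirs(root):
    """Remove empty directories below root (root itself is kept).

    Walking bottom-up lets a parent become empty, and hence removable,
    once its empty children are gone.
    """
    removed = []
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue
        try:
            os.rmdir(dirpath)  # only succeeds if the directory is empty
        except OSError:
            continue
        removed.append(dirpath)
    return removed
```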
Additionally, there is one cronjob running on all nodes:
-- AlessandraForti - 2018-09-26