How to set up an HTCondor CE service on a single host

The host should have at least 4 GB of RAM and 10 GB of disk space for simple tests; realistic jobs may need more memory and/or disk.

First set up a mini HTCondor service following the Admin Quick Start Guide:

https://research.cs.wisc.edu/htcondor/htcondor/documentation/

The Long Term Support (LTS) Channel (see below) provides v9.0.x, whose end of life (EOL) is scheduled for February 2023. It supports X509 proxies for both authentication and delegation, whereas the releases in the Feature Channel support them only for delegation, i.e. equipping jobs with such proxies. Both channels support SciTokens for authentication. In the course of 2022 we will need to make job submission with tokens work for HTCondor CEs across the infrastructure. Example configurations are shown below.

Notes on using the Long Term Support Channel

Note: the Admin Quick Start Guide defaults to the Feature Channel.
To deploy the LTS (a.k.a. stable) release, one can follow these steps,
which also prevent a fatal error encountered on CC7 hosts:

----------------------------------------------------------------------
yum remove epel-release-7
----------------------------------------------------------------------
curl -fsSL https://get.htcondor.org | \
GET_HTCONDOR_PASSWORD=<pick-a-password> \
/bin/bash -s -- --no-dry-run --channel stable
----------------------------------------------------------------------

If that worked as expected, your host is now running an HTCondor batch service.

Setting up the CE interface

For its CE interface, ensure the host has a certificate, the CAs and the desired VOMS configuration details. The IGTF Certificate Authorities can be installed from the ca-policy-egi-core rpm available from the EGI CA repository. The VOMS details for WLCG VOs can be installed from wlcg-voms-* rpms available from the WLCG rpm repository.

For example:

----------------------------------------------------------------------
(cd /etc/yum.repos.d/ && curl -O https://repository.egi.eu/sw/production/cas/1/current/repo-files/EGI-trustanchors.repo)
----------------------------------------------------------------------
yum install ca-policy-egi-core
----------------------------------------------------------------------
yum install http://linuxsoft.cern.ch/wlcg/centos7/x86_64/wlcg-repo-1.0.0-1.el7.noarch.rpm
----------------------------------------------------------------------
yum install wlcg-voms-{alice,lhcb,dteam}
----------------------------------------------------------------------

Ensure a valid host certificate has been installed:

----------------------------------------------------------------------
openssl x509 -noout -dates -in /etc/grid-security/hostcert.pem
----------------------------------------------------------------------
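
The same check can be scripted so that it fails loudly instead of relying on reading the dates. The following is a minimal sketch and not part of the original procedure: the helper names `cert_valid_for` and `key_mode_ok` and the 14-day margin are assumptions chosen for illustration. It relies on the `-checkend` option of `openssl x509` and on GNU `stat`.

```shell
#!/bin/bash
CERT=/etc/grid-security/hostcert.pem
KEY=/etc/grid-security/hostkey.pem

# key_mode_ok FILE: succeed when FILE is accessible by its owner only
# (octal mode 400 or 600), as in the directory listing in this guide.
key_mode_ok() {
    local mode
    mode=$(stat -c %a "$1") || return 1
    [ "$mode" = 400 ] || [ "$mode" = 600 ]
}

# cert_valid_for FILE DAYS: succeed when the certificate in FILE
# will still be valid DAYS days from now.
cert_valid_for() {
    openssl x509 -noout -checkend $(( $2 * 86400 )) -in "$1" > /dev/null
}

# Only run the checks when both files are present and readable.
if [ -r "$CERT" ] && [ -r "$KEY" ]; then
    cert_valid_for "$CERT" 14 || echo "WARNING: $CERT expires within 14 days"
    key_mode_ok "$KEY" || echo "WARNING: $KEY is accessible by others"
fi
```

The script prints nothing when the certificate has at least 14 days left and the key is private to its owner.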

The directories in question should resemble what is shown here:

----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/grid-security/
total 76
drwxr-xr-x. 2 root root 40960 May 18 18:48 certificates
-rw-r--r--. 1 root root  3198 Mar 13 04:07 gsi.conf
-r--r--r--. 1 root root  3060 May 18 15:18 hostcert.pem
-r--------. 1 root root  1828 May 18 15:18 hostkey.pem
drwxr-xr-x. 5 root root    44 May 18 15:08 vomsdir
----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/grid-security/vomsdir/
total 0
drwxr-xr-x. 2 root root 60 May 18 15:08 alice
drwxr-xr-x. 2 root root 37 May 18 15:08 dteam
drwxr-xr-x. 2 root root 60 May 18 15:08 lhcb
----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/grid-security/vomsdir/alice/
total 8
-rw-r--r--. 1 root root 101 Feb 11  2014 lcg-voms2.cern.ch.lsc
-rw-r--r--. 1 root root  97 Feb 11  2014 voms2.cern.ch.lsc
----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/grid-security/vomsdir/lhcb/
total 8
-rw-r--r--. 1 root root 101 Feb 11  2014 lcg-voms2.cern.ch.lsc
-rw-r--r--. 1 root root  97 Feb 11  2014 voms2.cern.ch.lsc
----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/grid-security/vomsdir/dteam/
total 4
-rw-r--r--. 1 root root 129 Jan 19  2017 voms2.hellasgrid.gr.lsc
----------------------------------------------------------------------
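
A quick way to compare a host against the listings above is to report any VO whose vomsdir entry has no `.lsc` file. This is a sketch under assumptions: the function name `vos_without_lsc` is hypothetical, and the VO list simply mirrors the `wlcg-voms-*` packages installed earlier.

```shell
#!/bin/bash
VOMSDIR=/etc/grid-security/vomsdir

# vos_without_lsc DIR VO...: print each VO lacking any *.lsc file in DIR/VO.
vos_without_lsc() {
    local dir=$1 vo
    shift
    for vo in "$@"; do
        # ls fails when the glob matches nothing, i.e. no .lsc file exists.
        ls "$dir/$vo"/*.lsc > /dev/null 2>&1 || echo "$vo"
    done
}

# The VO list matches the wlcg-voms-* packages installed in this guide.
vos_without_lsc "$VOMSDIR" alice lhcb dteam
```

No output means every listed VO has at least one `.lsc` file.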

Ensure the CRLs are up to date:

----------------------------------------------------------------------
yum install fetch-crl
----------------------------------------------------------------------
systemctl enable fetch-crl-cron
----------------------------------------------------------------------
systemctl start fetch-crl-cron
----------------------------------------------------------------------
fetch-crl > /tmp/crl-$$.log 2>&1 < /dev/null &
----------------------------------------------------------------------
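
Once the background `fetch-crl` run has finished, one way to confirm that CRLs were actually downloaded is to count the `<hash>.r0` files in the certificates directory. The helper name `count_crls` is an assumption made for this sketch; a count of zero would suggest inspecting the log file written by the command above.

```shell
#!/bin/bash
CERTDIR=/etc/grid-security/certificates

# count_crls DIR: print how many CRL files (named <hash>.r0) DIR contains.
count_crls() {
    find "$1" -maxdepth 1 -name '*.r0' 2> /dev/null | wc -l | tr -d ' '
}

echo "CRL files in $CERTDIR: $(count_crls "$CERTDIR")"
```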

We will now set up the HTCondor CE following these steps:

https://htcondor.com/htcondor-ce/v5/installation/htcondor-ce/

First:

----------------------------------------------------------------------
yum install htcondor-ce-condor
----------------------------------------------------------------------

Copy the pool password:

----------------------------------------------------------------------
cp /etc/condor/passwords.d/POOL /etc/condor-ce/passwords.d/
----------------------------------------------------------------------

Open the HTCondor CE port:

----------------------------------------------------------------------
firewall-cmd --permanent --zone=public --add-port=9619/tcp
----------------------------------------------------------------------
firewall-cmd --reload
----------------------------------------------------------------------
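
To verify that the rule took effect, the port list reported by `firewall-cmd --list-ports` can be inspected. The sketch below is an illustration only: `port_in_list` is a hypothetical helper, and it matches single ports, not port ranges such as `9000-9700/tcp`.

```shell
#!/bin/bash
# port_in_list PORT LIST: succeed when PORT (e.g. 9619/tcp) occurs as a
# whole word in the space-separated LIST printed by --list-ports.
port_in_list() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the running firewall only where firewall-cmd is available.
if command -v firewall-cmd > /dev/null 2>&1; then
    ports=$(firewall-cmd --zone=public --list-ports)
    port_in_list 9619/tcp "$ports" || echo "WARNING: 9619/tcp is not open in zone public"
fi
```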

The HTCondor CE daemon configuration should resemble the following (edit the files as indicated):

----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/condor-ce/config.d/
total 24
-rw-r--r--. 1 root root 1321 May 29 02:05 01-ce-auth.conf
-rw-r--r--. 1 root root 1714 Dec 21 22:11 01-ce-router.conf
-rw-r--r--. 1 root root 1362 Dec 21 22:11 01-pilot-env.conf
-rw-r--r--. 1 root root 1444 Dec 21 22:11 02-ce-condor.conf
-rw-r--r--. 1 root root  500 Dec 21 22:11 03-managed-fork.conf
-rw-r--r--. 1 root root   41 May 29 02:17 50-schedd2.conf
----------------------------------------------------------------------
[root@mini-htc ~]# grep ^AUTH /etc/condor-ce/config.d/01-ce-auth.conf 
AUTH_SSL_SERVER_CERTFILE = /etc/grid-security/hostcert.pem
AUTH_SSL_SERVER_KEYFILE = /etc/grid-security/hostkey.pem
AUTH_SSL_SERVER_CADIR = /etc/grid-security/certificates
AUTH_SSL_SERVER_CAFILE =
AUTH_SSL_CLIENT_CERTFILE = /etc/grid-security/hostcert.pem
AUTH_SSL_CLIENT_KEYFILE = /etc/grid-security/hostkey.pem
AUTH_SSL_CLIENT_CADIR = /etc/grid-security/certificates
AUTH_SSL_CLIENT_CAFILE =
----------------------------------------------------------------------
[root@mini-htc ~]# cat /etc/condor-ce/config.d/50-schedd2.conf  
JOB_ROUTER_SCHEDD2_POOL = localhost:9618
----------------------------------------------------------------------

User mapping details and examples:

----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/condor-ce/mapfiles.d/
total 20
-rw-r--r--. 1 root root 1305 Dec 21 22:11 10-gsi.conf
-rw-r--r--. 1 root root 1095 Dec 21 22:11 10-scitokens.conf
-rw-r--r--. 1 root root   78 May 29 02:07 11-gsi.conf
-rw-r--r--. 1 root root   99 May 29 02:07 11-scitokens.conf
-rw-r--r--. 1 root root  540 May 29 02:06 50-gsi-callout.conf
----------------------------------------------------------------------
[root@mini-htc ~]# cat /etc/condor-ce/mapfiles.d/11-gsi.conf 
GSI /.*,\/alice\/Role=lcgadmin/ alicesgm
GSI /.*,\/alice\/Role=NULL/ alice001
----------------------------------------------------------------------
[root@mini-htc ~]# cat /etc/condor-ce/mapfiles.d/11-scitokens.conf 
SCITOKENS /^https:\/\/wlcg\.cloud\.cnaf\.infn\.it\/,8c3c01a9-ee96-4f6e-989c-ad1e279244ae$/ wlcg001
----------------------------------------------------------------------
[root@mini-htc ~]# grep GSI /etc/condor-ce/mapfiles.d/50-gsi-callout.conf | tail -n 1
#GSI /(.*)/ GSS_ASSIST_GRIDMAP
----------------------------------------------------------------------

NOTE: the GSS_ASSIST_GRIDMAP line must be commented out or removed!
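
The requirement in the note can be checked mechanically by searching the mapfiles for any uncommented GSS_ASSIST_GRIDMAP line. This is a minimal sketch; the function name `active_callouts` is an assumption, and it expects the mapfiles to end in `.conf` as in the listing above.

```shell
#!/bin/bash
MAPDIR=/etc/condor-ce/mapfiles.d

# active_callouts DIR: print every uncommented line in DIR/*.conf that still
# refers to GSS_ASSIST_GRIDMAP, prefixed with its file name. The exit status
# is 0 only when at least one such line exists.
active_callouts() {
    grep -H 'GSS_ASSIST_GRIDMAP' "$1"/*.conf 2> /dev/null |
        grep -v '^[^:]*:[[:space:]]*#'
}

if active_callouts "$MAPDIR"; then
    echo "WARNING: comment out or remove the lines listed above"
fi
```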

Add the necessary grid job accounts:

----------------------------------------------------------------------
adduser alicesgm
----------------------------------------------------------------------
adduser alice001
----------------------------------------------------------------------
adduser wlcg001
----------------------------------------------------------------------

Result:

----------------------------------------------------------------------
[root@mini-htc ~]# tail -n 3 /etc/passwd
alicesgm:x:19984:19984::/home/alicesgm:/bin/bash
alice001:x:19985:19985::/home/alice001:/bin/bash
wlcg001:x:19986:19986::/home/wlcg001:/bin/bash
----------------------------------------------------------------------
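
When the procedure is repeated on a host, `adduser` fails for accounts that already exist. A possible way around that, sketched here with the hypothetical helpers `missing_accounts` and `ensure_accounts`, is to create only the accounts that are still missing.

```shell
#!/bin/bash
# missing_accounts NAME...: print each NAME for which no account exists.
missing_accounts() {
    local name
    for name in "$@"; do
        id -u "$name" > /dev/null 2>&1 || echo "$name"
    done
}

# ensure_accounts NAME...: create only the accounts that are missing (run as root).
ensure_accounts() {
    local name
    for name in $(missing_accounts "$@"); do
        adduser "$name"
    done
}

# Report which of the accounts used by the mapfiles in this guide are missing.
missing_accounts alicesgm alice001 wlcg001
```

Running `ensure_accounts alicesgm alice001 wlcg001` as root then creates whatever is still missing.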

We need to set an extra parameter for the HTCondor batch service as well:

----------------------------------------------------------------------
[root@mini-htc ~]# ll /etc/condor/config.d/
total 16
-rw-r--r--. 1 root root 1004 May 26 21:46 00-htcondor-9.0.config
-rw-r--r--. 1 root root 2501 May 26 21:46 00-minicondor
-rw-r--r--. 1 root root  451 Dec 21 22:11 50-condor-ce-defaults.conf
-rw-r--r--. 1 root root   39 May 29 02:26 99-extra.conf
----------------------------------------------------------------------
[root@mini-htc ~]# cat /etc/condor/config.d/99-extra.conf 
QUEUE_SUPER_USER_MAY_IMPERSONATE = .*
----------------------------------------------------------------------

To ensure the services run with the specified configuration, we restart them. First stop both:

----------------------------------------------------------------------
systemctl stop condor-ce
----------------------------------------------------------------------
systemctl stop condor
----------------------------------------------------------------------

Ensure all related processes are gone, then start the services and check that the processes are all back:

----------------------------------------------------------------------
ps afuxwww | grep -o '.*condor[_][^ ]*'
----------------------------------------------------------------------
systemctl start condor
----------------------------------------------------------------------
systemctl start condor-ce
----------------------------------------------------------------------
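
The stop, check and start sequence can also be wrapped in a script that waits for the old processes to disappear before starting the services again. The sketch below makes several assumptions: the function names and the 60-second timeout are illustrative, and it watches only `condor_master`, on the expectation that each master exits after its child daemons.

```shell
#!/bin/bash
# wait_for_exit NAME TIMEOUT: poll once per second until no process called
# exactly NAME is left; fail if one is still present after TIMEOUT seconds.
wait_for_exit() {
    local name=$1 remaining=$2
    while pgrep -x "$name" > /dev/null; do
        [ "$remaining" -gt 0 ] || return 1
        sleep 1
        remaining=$((remaining - 1))
    done
}

# restart_condor_services: stop the CE and the batch service, then start them
# again only once every condor_master has exited (run as root).
restart_condor_services() {
    systemctl stop condor-ce
    systemctl stop condor
    wait_for_exit condor_master 60 || {
        echo "condor_master still running" >&2
        return 1
    }
    systemctl start condor
    systemctl start condor-ce
}
```

Calling `restart_condor_services` as root performs the whole sequence; the process list can then be inspected with the `ps` command used in this guide.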

The list of processes should look as shown here:

----------------------------------------------------------------------
[root@mini-htc ~]# ps afuxwww | grep -o '.*condor[_][^ ]*'
condor     95485  0.0  0.0  71632  7040 ?        Ss   16:23   0:00 /usr/sbin/condor_master
root       95528  0.0  0.0  23464  3996 ?        S    16:23   0:00  \_ condor_procd
condor     95530  0.0  0.0  44660  5952 ?        Ss   16:23   0:00  \_ condor_shared_port
condor     95531  0.0  0.0  45688  6720 ?        Ss   16:23   0:00  \_ condor_collector
condor     95532  0.0  0.0  45448  6664 ?        Ss   16:23   0:00  \_ condor_negotiator
condor     95533  0.0  0.1  46864  7632 ?        Ss   16:23   0:00  \_ condor_schedd
condor     95534  0.0  0.0  45960  7044 ?        Ss   16:23   0:00  \_ condor_startd
condor     95573  0.0  0.0  71672  5640 ?        Ss   16:23   0:00 condor_master
root       95622  0.0  0.0  23600  4036 ?        S    16:23   0:00  \_ condor_procd
condor     95623  0.0  0.0  44788  5956 ?        Ss   16:23   0:00  \_ condor_shared_port
condor     95625  0.0  0.2 188472 17568 ?        Ss   16:23   0:00  \_ condor_collector
condor     95628  0.0  0.1  46788  7496 ?        Ss   16:23   0:00  \_ condor_schedd
condor     95629  0.0  0.0  45156  6484 ?        Ss   16:23   0:00  \_ condor_job_router
----------------------------------------------------------------------

The host should now be ready for running test jobs submitted with X509 / VOMS proxies and/or SciTokens according to the implemented configuration.
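
A first test job could be submitted with `condor_ce_trace`, the diagnostic tool that comes with the HTCondor CE client tools. The wrapper below is a sketch: the function name `ce_smoke_test` is hypothetical, and it assumes the caller is an ordinary user holding a VOMS proxy or token that matches one of the mapfile entries configured above.

```shell
#!/bin/bash
# ce_smoke_test HOST: submit a diagnostic job to the CE on HOST with
# condor_ce_trace and show verbose output. Returns 127 when the tool
# is not installed.
ce_smoke_test() {
    if ! command -v condor_ce_trace > /dev/null 2>&1; then
        echo "condor_ce_trace not found" >&2
        return 127
    fi
    condor_ce_trace --debug "$1"
}
```

For a single-host setup as described here, `ce_smoke_test "$(hostname -f)"` would target the local CE.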

Topic revision: r1 - 2022-05-29 - MaartenLitmaath
 