Welcome to the CERN-Tier0 Analysis Facility (CAF) for (p)DUNE Users TWiki Home page

General

The (p)DUNE Analysis Facility is located at CERN Tier-0 to provide fast response to latency-critical activities:

  • Diagnosis of detector problems
  • Prompt alignment and calibration
    • export of new constants to Tier-0 and other computing centres worldwide (FNAL) for future data reprocessing
  • Performance services
  • Hot physics analysis

The task of the (p)DUNE Tier-0 system is to perform the prompt reconstruction of the raw data coming from the online data-acquisition system, and to register the raw and derived data with the File Transfer Service (FTS), which then distributes them to the FNAL centre and beyond.

Introduction

Tier0-core/nodes/EOS access

The Neutrino Platform and its prototype experiments at CERN, NP02/NP04, have dedicated cores and storage at CERN Tier-0. Currently available: 1 PB of EOS disk, 6 PB of tape and 1500 cores; from August onwards NP will provide 3 PB of EOS disk space. The machines are normal batch worker nodes with 2 GB of memory per core, and the jobs run alongside other jobs on the batch farm. The batch system is based on HTCondor. The CPUs are a mix of newer and older hardware, typically less than 3 years old, and the typical machine size is 8 cores. For more information on how to access the NP experiments' EOS space, follow this link.

Software Installation

Submitting jobs to NP Tier0 cores

After logging in to the lxplus cluster, you can install larsoft/dunetpc in the same way as on the neutplatform cluster. See instructions 1, 2.
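The linked instructions are authoritative; as a rough orientation only, the setup on lxplus typically looks something like the lines below (the CVMFS path, dunetpc version and qualifiers here are examples/assumptions, not fixed values):

# sketch only: set up larsoft/dunetpc from CVMFS on lxplus (version/qualifiers are examples)
source /cvmfs/dune.opensciencegrid.org/products/dune/setup_dune.sh
setup dunetpc v06_68_00 -q e15:prof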

You can find a set of example scripts in the NP GitLab repository. If you have problems accessing it, let me know.

To submit the condor job: 
condor_submit nptest_htcondorjob.sub
Have a look at the self-explanatory comments of the scripts.
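For orientation, a stripped-down submit file might look roughly like the sketch below; the executable name, EOS paths and job flavour are placeholders, not the contents of the actual nptest_htcondorjob.sub:

# minimal HTCondor submit-file sketch (all names and paths below are placeholders)
executable  = run_np_job.sh                                            # hypothetical wrapper script
arguments   = $(ProcId)
output      = /eos/user/y/yourname/np/job.$(ClusterId).$(ProcId).out   # example EOS location
error       = /eos/user/y/yourname/np/job.$(ClusterId).$(ProcId).err
log         = /eos/user/y/yourname/np/job.$(ClusterId).log
+JobFlavour = "workday"                                                # CERN batch run-time category
queue 1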
To examine running jobs, you have several options. The closest analogue to "bpeek" (from the lxbatch system) is "condor_tail <jobID>", which can be used to inspect the standard output (or other files condor knows about) of running jobs. Alternatively, "condor_ssh_to_job <jobID>" drops you into the same sandbox as the running job, allowing you to inspect it as you see fit.
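As a quick sketch (the job ID below is just an example):

condor_q                      # list your jobs and note the ClusterId.ProcId
condor_tail 1234567.0         # peek at the running job's standard output
condor_ssh_to_job 1234567.0   # open a shell inside the job's sandbox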

You can also use EOS for the input/output/log files.
 
Keep in mind: "condor_submit -spool" will just take your files and submit them to the schedd, and won't write to them in the meantime. You can then retrieve the outputs when your job completes using "condor_transfer_data". Alternatively, you can move output files at the end of the job by having the script you submit as the "executable" do it for you.
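A rough sequence for the spool workflow (the cluster ID below is an example):

condor_submit -spool nptest_htcondorjob.sub   # input files are copied to the schedd at submission
condor_q                                      # wait for the job to complete; note the ClusterId
condor_transfer_data 1234567                  # fetch the output files back from the schedd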

To monitor NP02/NP04 batch jobs, follow link1 and link2, respectively.

A collection of useful HTCondor commands can be found here. For more information, have a look at the Quick Start Guide from CERN HTCondor.
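A few commonly used commands, for quick reference (a generic HTCondor selection, not necessarily the full list behind the link):

condor_q                     # show your queued and running jobs
condor_q -better-analyze     # explain why a job is still idle
condor_status                # show the state of the worker nodes
condor_rm <jobID>            # remove a job
condor_history <jobID>       # look up a finished job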

CAF system accounts

Specific CAF subsystem accounts and job-priority schedulers/queues:

For the HTCondor schedds:

The standard condor_schedds that IT provides are not available for login because they hold people's credentials, but they are load-balanced, so in principle they should be fine. There is an option later, if we all agree/want it, to run our own schedd (I'm in favour of this option). This can be handy if, for example, lots of production jobs are submitted from the same machine: a local schedd gives a much faster response.

CAF batch groups with priority shares

The following lxbatch batch groups with priority shares are available for the detector systems and the combined performance groups. The batch group managers are responsible for adding and removing members.

| Detector system | Batch group name (bugroup) | Batch group manager |

| Performance group | Batch group name (bugroup) | Batch group manager |

Data Model

Useful Commands

Monitoring

CERN's HTCondor monitoring is here. The following monitoring links show whether users are using our resources.

Batch Monitoring for NP02

HTCondor monitoring for NP02

Batch Monitoring for NP04

HTCondor monitoring for NP04

Operations

EOS and TAPE

  • For more information on how to access the NP experiments' EOS space, follow this link.
  • For more information on how to access the NP experiments' CASTOR (CERN Advanced STORage manager) space, the data tape storage system used at CERN, follow this link. A minimal copy sketch is shown after this list.
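As a rough sketch of moving files with xrdcp (the host names and paths below are placeholders/assumptions; use the experiment-specific paths given in the links above):

# copy a local file to NP EOS space (example path)
xrdcp myfile.root root://eospublic.cern.ch//eos/experiment/neutplatform/myfile.root
# copy it back from EOS
xrdcp root://eospublic.cern.ch//eos/experiment/neutplatform/myfile.root .
# read a file from CASTOR tape storage (example path)
xrdcp root://castorpublic.cern.ch//castor/cern.ch/neutplatform/myfile.root .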

Miscellanea

Useful links

Contacts

Tier-0 contacts:

In case of problems with the on-call phone, contact the experts directly:

Back to Neutrino Platform Computing Twiki Main Page


Major updates:

-- NectarB - 2017-02-03
