WLCG Critical Services

Introduction

This page lists, for each LHC experiment, the services that are:

  • not operated by the experiment's own personnel, and
  • deemed critical for the successful operation of its grid workflows and related activities.

Most of those services are hosted and operated by CERN-IT, while several Tier-1 sites and other partners also provide some.

For every relevant service, each experiment has indicated the effects of the service being unavailable:

  • The impact is the effect on operations or people if the service were unavailable for a few days.
  • The urgency is how quickly that impact would be reached.
  • The criticality is the product of urgency and impact.

On the right-hand side of each table there are columns for the maximum criticality of a service across the experiments, the sum of its criticalities across the experiments, and the weighted maximum criticality. The weighted maximum ranks services with identical maximum criticalities according to their respective sums of criticalities. Each numeric column can be sorted in ascending (descending) order by clicking once (twice) on its header.
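
As a worked illustration of these definitions, the Python sketch below computes the criticality of each rating and the three summary columns for one service. The names Rating and summarize are illustrative and not part of any WLCG tool. The text only states that the weighted maximum orders services with equal maxima by their sums; the formula 10 × max + sum used here is an assumption, chosen because it reproduces the weighted values listed in the tables on this page. An experiment without a rating is counted as criticality 0, as in the tables.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class Rating:
    """One experiment's rating of a service (illustrative type)."""
    urgency: int  # 10, 7, 4 or 1
    impact: int   # 10, 7 or 4

    @property
    def criticality(self) -> int:
        # Criticality is defined as the product of urgency and impact.
        return self.urgency * self.impact


def summarize(ratings: Dict[str, Optional[Rating]]) -> Tuple[int, int, int]:
    """Return (max, sum, weighted max) of the criticalities of one service."""
    # A missing rating (None) counts as criticality 0.
    crits = [r.criticality if r is not None else 0 for r in ratings.values()]
    max_crit = max(crits)
    sum_crit = sum(crits)
    # ASSUMPTION: weighted maximum = 10 * max + sum (inferred from the tables).
    weighted_max = 10 * max_crit + sum_crit
    return max_crit, sum_crit, weighted_max


# Ratings of the "Px-CC network" service as listed on this page.
px_cc = {
    "ALICE": Rating(7, 10),
    "ATLAS": Rating(7, 10),
    "CMS": Rating(4, 10),
    "LHCb": Rating(10, 10),
}
print(summarize(px_cc))  # (100, 280, 1280)
```

The printed tuple matches the max, sum and weighted maximum columns of that row.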

Impact on operations and/or people

Level Definition
10 ops/VO severely affected
7 ops/VO notably affected
4 ops/VO moderately affected

Urgency levels

Level Definition
10 full impact reached within 6 hours
7 full impact reached within 1 day
4 full impact reached within 2 days
1 full impact reached after 2 days
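
A minimal sketch of how an estimated time to full impact could be mapped onto the urgency levels above. The function name is hypothetical, and treating each limit as inclusive (exactly 6 hours still counts as "within 6 hours") is an assumption, since the table does not say how boundary cases are handled.

```python
def urgency_level(hours_to_full_impact: float) -> int:
    """Map the time until full impact (in hours) to an urgency level."""
    if hours_to_full_impact < 0:
        raise ValueError("time to full impact cannot be negative")
    if hours_to_full_impact <= 6:    # within 6 hours
        return 10
    if hours_to_full_impact <= 24:   # within 1 day
        return 7
    if hours_to_full_impact <= 48:   # within 2 days
        return 4
    return 1                         # after 2 days


for hours in (3, 12, 36, 72):
    print(hours, "hours ->", urgency_level(hours))
```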

Criticality levels

As a visual aid, 3 criticality ranges have been defined with distinct colors.
For a given experiment and for the maximum across the experiments, the ranges are as follows:

Range     Criticality
top       70-100
high      40-69
moderate  0-39

For the sum of the criticalities across the experiments:

Range     Sum of criticalities
top       210-400
high      120-209
moderate  0-119

The colors for the weighted maximum values correspond to those of the maximum values across the experiments.
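
The ranges above can be expressed as a small lookup, sketched here in Python. The function name and the scope labels "experiment" and "sum" are illustrative choices. Following the text, a weighted maximum has no thresholds of its own: it takes the range of the corresponding maximum, so it would be classified by passing that maximum with the "experiment" scope.

```python
# Lower bounds of the "top" and "high" ranges, per kind of column.
THRESHOLDS = {
    "experiment": (70, 40),   # single experiment, or maximum across experiments
    "sum": (210, 120),        # sum across experiments
}


def criticality_range(value: int, scope: str = "experiment") -> str:
    """Return 'top', 'high' or 'moderate' for a criticality value."""
    top, high = THRESHOLDS[scope]
    if value >= top:
        return "top"
    if value >= high:
        return "high"
    return "moderate"


print(criticality_range(49))           # high
print(criticality_range(133, "sum"))   # high
print(criticality_range(28))           # moderate
```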

Purpose of the tables

These tables are meant to clarify which level of attention each service requires in its implementation and operation, so that the effects of service unavailability on the experiments are minimized to the extent feasible. For example, a highly critical service should, if possible, be implemented and monitored more robustly than a less critical one. High-availability (HA) deployment, load balancing and/or hot-standby setups should be considered in such cases.

These tables make no promises about the level of support that can be expected for a given service. Unless a specific arrangement has been made for a particular service, support is best-effort, though in practice it is usually compatible with the actual criticality of the service. If it is not, the implementation and operation of the service can be reviewed.

CERN-IT services

Links go to a page indicating how to contact the right support unit in GGUS or SNow for the given service.

A dash (-) marks a rating that was not provided.

Service                          SNow FE/SE                        ALICE          ATLAS          CMS            LHCb            max   sum    wtd
                                                                   urg imp crit   urg imp crit   urg imp crit   urg imp crit   crit  crit    max
Px-CC network                    Datacenter-Network                  7  10   70     7  10   70     4  10   40    10  10  100    100   280   1280
LHC-OPN / LHC-ONE / GPN          Datacenter-Network                  7  10   70     7  10   70     7  10   70     7  10   70     70   280    980
Oracle online                    oracle-database                    10  10  100    10  10  100    10  10  100    10  10  100    100   400   1400
Oracle offline (inc. streaming)  oracle-database                     4   7   28    10  10  100     7  10   70    10  10  100    100   298   1298
DB-on-Demand                     db-on-demand                        -   -    0     7  10   70     4  10   40    10  10  100    100   210   1210
CTA                              CTA-service                         4   7   28     7   7   49     4   7   28     4   7   28     49   133    623
EOS                              eos-service                         7  10   70     7   7   49     7  10   70     7   7   49     70   238    938
FTS                              FTS                                 -   -    0    10  10  100     4   7   28     4  10   40    100   168   1168
Global xrootd redirector         eos-service                         -   -    0     -   -    0     7   7   49     -   -    0     49    49    539
Ceph                             Ceph-Service                        -   -    0    10  10  100     4   7   28    10  10  100    100   228   1228
CVMFS Stratum-0                  cvmfs                               7  10   70     7  10   70     4   7   28     4  10   40     70   208    908
CVMFS Stratum-1                  cvmfs                               4   7   28     7   4   28     4   7   28     7  10   70     70   154    854
Frontier and Squid               cvmfs                               -   -    0     7   7   49     7  10   70     -   -    0     70   119    819
Batch service                    LXBATCH                             7   7   49     7   7   49     4   7   28     4   7   28     49   154    644
Dedicated batch                  LXBATCH                             -   -    0     7   7   49    10   7   70     -   -    0     70   119    819
CE                               LXBATCH                             7   7   49     7   7   49     4   4   16     4   7   28     49   142    632
IAM                              WLCG-IAM                            4  10   40     7  10   70     4  10   40     7  10   70     70   220    920
VOMS                             VOMS                                4  10   40     7  10   70     4  10   40     7  10   70     70   220    920
MyProxy                          MyProxy                             4  10   40     4   4   16     4  10   40     -   -    0     40    96    496
CRIC                             cric                                1   4    4     7   7   49     4   4   16     1   4    4     49    73    563
WAU / WSSA                       WLCG-WAU                            1   4    4     1   4    4     -   -    0     1   4    4      4    12     52
                                 WLCG-WSSA
BDII                             BDII                                -   -    0     -   -    0     -   -    0     -   -    0      0     0      0
Monit                            monitoring                          1   4    4     7   7   49     7   7   49     7   7   49     49   151    641
SiteMon                          WLCG-Experiment-Probe-Submission    1   4    4     4   4   16     7   7   49     1   4    4     49    73    563
AI cloud services                cloud-infrastructure                4   7   28    10  10  100     7   7   49    10  10  100    100   277   1277
                                 Configuration-Management
                                 dns-load-balancing
Kubernetes                       cloud-infrastructure                -   -    0    10  10  100     7   7   49     -   -    0    100   149   1149
Lxplus                           LXPLUS                              4   7   28     7   7   49     7   7   49    10   7   70     70   196    896
AFS                              AFS                                 -   -    0     7   7   49     7  10   70     -   -    0     70   119    819
GitLab                           version-control                     7   7   49     7   4   28     7   7   49     7  10   70     70   196    896
JIRA                             JIRA-ITS                            4   4   16     7   4   28     4   4   16     4   4   16     28    76    356
Twiki                            twiki                               1   4    4     7   4   28     7   7   49     4   4   16     49    97    587
Indico                           indico                              1   4    4     7   7   49     4   7   28     7   7   49     49   130    620
Video conf                       zoom                                7   7   49     7   7   49     7   7   49     7   7   49     49   196    686
Windows terminal service         windows-terminal                    1   4    4     1   4    4     -   -    0     -   -    0      4     8     48
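
The ranking described in the introduction (highest maximum criticality first, ties broken by the higher sum) can be sketched in Python as follows. The function name is illustrative, and the sample rows are the max and sum values of five CERN-IT services taken from the table above. For these rows the resulting order is the same as the descending order of the weighted maximum column.

```python
def rank_services(rows):
    """Sort (name, max_crit, sum_crit) tuples by maximum, then by sum, descending."""
    return sorted(rows, key=lambda row: (row[1], row[2]), reverse=True)


sample = [
    ("EOS", 70, 238),
    ("Px-CC network", 100, 280),
    ("Oracle online", 100, 400),
    ("LHC-OPN / LHC-ONE / GPN", 70, 280),
    ("Oracle offline (inc. streaming)", 100, 298),
]

for name, max_crit, sum_crit in rank_services(sample):
    print(f"{name:<32} {max_crit:>4} {sum_crit:>4}")
```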

Services at other sites

A dash (-) marks a rating that was not provided.

Service             ALICE          ATLAS          CMS            LHCb            max   sum    wtd
                    urg imp crit   urg imp crit   urg imp crit   urg imp crit   crit  crit    max
GOCDB                 1   4    4     4   4   16     4   4   16     7   7   49     49    85    575
MyOSG                 -   -    0     4   4   16     4   4   16     -   -    0     16    32    192
GGUS                  1   4    4     4   4   16     7   7   49     7   4   28     49    97    587
FTS                   -   -    0    10  10  100     4   7   28     4  10   40    100   168   1168
Stratum-1             4   7   28     7   4   28     4   7   28     7  10   70     70   154    854
Accounting Portal     1   4    4     1   4    4     -   -    0     1   4    4      4    12     52
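
When rows like these are maintained by hand, a small consistency check can catch arithmetic slips. The Python sketch below is one way to do it. The function name and the row layout (a list of (urg, imp, crit) tuples in the order ALICE, ATLAS, CMS, LHCb, with None for a missing rating) are assumptions made for this example. It checks only the relations stated in the introduction: criticality is urgency times impact, and the max and sum columns are taken across the experiments, with a missing rating counted as 0.

```python
def check_row(name, ratings, max_crit, sum_crit):
    """Return a list of inconsistencies found in one table row (empty if none)."""
    problems = []
    crits = []
    for entry in ratings:
        if entry is None:          # no rating given: criticality counts as 0
            crits.append(0)
            continue
        urg, imp, crit = entry
        if urg * imp != crit:
            problems.append(f"{name}: {urg} x {imp} != {crit}")
        crits.append(crit)
    if max(crits) != max_crit:
        problems.append(f"{name}: max is {max(crits)}, table says {max_crit}")
    if sum(crits) != sum_crit:
        problems.append(f"{name}: sum is {sum(crits)}, table says {sum_crit}")
    return problems


# Two rows from the table above.
gocdb = [(1, 4, 4), (4, 4, 16), (4, 4, 16), (7, 7, 49)]
myosg = [None, (4, 4, 16), (4, 4, 16), None]

print(check_row("GOCDB", gocdb, 49, 85))   # []
print(check_row("MyOSG", myosg, 16, 32))   # []
```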

Previous versions

Topic attachments
Attachment      Size    Date              Who
ALICE_crit.png  19.9 K  2015-03-12 15:27  AndreaSciaba
ATLAS_crit.png  21.9 K  2015-03-12 15:28  AndreaSciaba
CMS_crit.png    20.9 K  2015-03-12 15:28  AndreaSciaba
LHCb_crit.png   22.1 K  2015-03-12 15:28  AndreaSciaba
Topic revision: r42 - 2023-10-24 - ConcezioBozzi
 