This page defines:
- The products involved in the SC4 service definitions
- How the products map to services
- The service parameters
It does not cover the technical aspects of how the service is to be delivered (see
ScFourServiceTechnicalFactors for this information).
Service Class
The Service Class is a set of parameters which share the same service level objectives. It provides an easy way of describing the high-level parameters required for a service (for example, "the BDII is class C" rather than "the BDII requires 99% availability with 1 hour response time...").
| Class | Description | Downtime | Reduced | Degraded | Avail |
| C | Critical | 1 hour | 1 hour | 4 hours | 99% |
| H | High | 4 hours | 6 hours | 6 hours | 99% |
| M | Medium | 6 hours | 6 hours | 12 hours | 99% |
| L | Low | 12 hours | 24 hours | 48 hours | 98% |
| U | Unmanaged | None | None | None | None |
where
- Downtime defines the time between the start of the problem and restoration of service at minimal capacity (i.e. basic function but capacity < 50%)
- Reduced defines the time between the start of the problem and the restoration of a reduced capacity service (i.e. > 50%)
- Degraded defines the time between the start of the problem and the restoration of a degraded capacity service (i.e. > 80%)
- Avail (availability) is the percentage of the calendar period during which the service is up, i.e. the total time minus the accumulated downtime, divided by the total time. Site-wide failures are excluded from the availability calculation. 99% means a service can be down for up to about 3.65 days a year in total; 98% means up to about 7.3 days (roughly a week) in total.
- None means the service is running unattended
The service class structure is based on the MoU values.
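The relation between an availability target and the downtime it permits can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a 365-day calendar year; the function name `allowed_downtime_days` is illustrative and not part of any SC4 tooling.

```python
# Availability targets per service class, copied from the table above.
# Class U (Unmanaged) has no target and is therefore omitted.
AVAILABILITY = {"C": 0.99, "H": 0.99, "M": 0.99, "L": 0.98}

DAYS_PER_YEAR = 365  # assumption: a non-leap calendar year


def allowed_downtime_days(availability: float, period_days: float = DAYS_PER_YEAR) -> float:
    """Return the total downtime (in days) permitted by an availability target."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a fraction between 0 and 1")
    return (1.0 - availability) * period_days


for service_class, target in AVAILABILITY.items():
    days = allowed_downtime_days(target)
    print(f"Class {service_class}: {target:.0%} -> up to {days:.2f} days down per year")
```

Passing a different `period_days` (for example 30) gives the budget for a shorter calendar period.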
Calendars
These define the times during the day and in the cycle of accelerator operations. Required service levels change depending on the time of day or year due to changing usage of the systems and the trade-off between cost and reliability.
where
- prime shift is defined as 08:00-18:00 working days
- second shift is anytime outside of prime shift
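The shift definitions above could be applied to a timestamp roughly as follows. This is only a sketch: it assumes that working days are Monday to Friday, that public holidays and laboratory closures are ignored, that timestamps are in local time, and that 18:00 itself already belongs to second shift. The function name `shift_for` is hypothetical.

```python
from datetime import datetime, time

PRIME_START = time(8, 0)   # 08:00, from the definition above
PRIME_END = time(18, 0)    # 18:00, from the definition above


def shift_for(moment: datetime) -> str:
    """Classify a local timestamp as 'prime' or 'second' shift.

    Assumptions: working days are Monday to Friday, holidays are not
    modelled, and the 18:00 boundary is treated as exclusive.
    """
    is_working_day = moment.weekday() < 5  # Monday=0 ... Friday=4
    in_prime_hours = PRIME_START <= moment.time() < PRIME_END
    return "prime" if is_working_day and in_prime_hours else "second"
```

A real implementation would also need a holiday calendar, which this page does not define.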
The calendar does not address the change of service levels from the early phase (September 2006) to full production (2007). Service levels should be defined for the full production state.
Products
A product consists of a set of hardware and software that can provide functions for which the availability can be measured. Products make no distinction between development, test or production, or between users. A product does not have a service level associated with it, since service levels are defined at the Service level.
| Product Name | Short Code | Purpose | Dvl Org | Dvl Contact |
| Resource Broker | RB | Farms out jobs to sites + logging and book-keeping | | David Smith |
| MyProxy | PX | Renew/acquire credentials | | Maarten Litmaath |
| BDII | BDII | Grid information system | | Laurence Field |
| ComputeElement | CE | Gateway to local batch system | | |
| R-GMA | RGMA | Grid Monitoring | | Laurence Field |
| Monbox | MONB | Grid Monitoring (including Capacity and performance data archiver) | | Laurence Field |
| Grid View | GRVW | Monitoring | | |
| Site Functional Tester | SFT | Regular tests of components per site | | Piotr Nyczyk, Judit Novak |
| Grid Peek | GRPK | Storage of outputs of running jobs | | Patricia Mendez |
| VOMS | VOMS | Manages mapping of User / Roles / VO | | Maria Dimou |
| LCG File Catalog | LFC | Maps file names to storage locations | | Jean-Philippe Baud, Sophie Lemaitre |
| File Transfer Service | FTS | Reliable file transfer delivery | | fts-support@cernNOSPAMPLEASE.ch |
| Storage Element | SE | SRM Compatible Storage Service | | |
where
- Product Name is the usual English name used to describe the code
- Short Code is a short name which can be used for naming conventions such as hostnames or commands
- Purpose describes what functions the product delivers
- Dvl Org is the organisation responsible for development of the software and enhancements
- Dvl Contact is the primary contact within the Dvl Org
Customers
These are the users of the service.
Services
A service is constructed from a product (describing which application), a customer (who uses it) and an instance (what it is for).
| Service | Instance | Product | Customer | Class AP | Class AS | Class OP | Class OS | Sup Org | Support Contact |
| RBP | Production Resource Broker at CERN | RB | SH | C | C | C | C | | David Smith |
| PXP | Production MyProxy at CERN | PX | SH | C | C | C | C | | Maarten Litmaath |
| BDIIP | Production BDII for Grid Information System | BDII | SH | C | C | C | C | | Laurence Field |
| BDIIS | Production Site BDII at CERN | BDII | SH | H | H | H | H | | Laurence Field |
| CEP | Production Compute Element at CERN | CE | SH | C | C | C | C | IT/FIO | Thorsten Kleinwort |
| RGMAP | Production R-GMA at CERN | RGMA | SH | M | M | M | M | | Laurence Field |
| MONBP | Production Monbox at CERN | MONB | SH | M | M | M | M | | Laurence Field |
| GRVWP | Production Grid View at CERN | GRVW | SH | M | L | M | L | | |
| SFTP | Production Site Functional Tester | SFT | SH | M | M | M | M | | Piotr Nyczyk |
| GRPKP | Production Grid Peek Service | GRPK | SH | M | M | M | M | | Patricia Mendez |
| VOMSP | Production VOMS | VOMS | SH | C | C | C | C | | Maria Dimou |
| LFCP-ALICE | ALICE Production LCG File Catalog | LFC | Z2 | H | H | H | H | | |
| LFCP-ATLAS | ATLAS Production LCG File Catalog | LFC | ZP | H | H | H | H | | |
| LFCP-CMS | CMS Production LCG File Catalog | LFC | ZH | H | H | H | H | | |
| LFCP-LHCB | LHCb Production LCG File Catalog | LFC | Z5 | C | C | C | C | | |
| FTSP | Production File Transfer Service | FTS | Z2,ZP,Z5 | C | C | C | C | | |
| SEP | Production CastorGrid and Castor | SE | SH | C | C | C | C | | |
where
- Product defines the code which provides the visible function of the service
- Customer is the primary group using the service. SH (shared) implies the service is used by all experiments
- Class for each of the calendar windows defines which service class is expected during that window. Where current software restrictions limit the service level that can be delivered, an entry such as M->H should be used to show that the best that can currently be delivered is Medium but the requirement is High. The SC4 tests can be used to validate the feasibility of the requested service level.
- Sup Org is the organisation providing support for the service, covering problem resolution and deployment
- Support Contact is the primary contact point in the Sup Org for the service
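A tool reading the Class columns would need to handle both plain entries (such as H) and the M->H form described above. A minimal sketch of such a parser follows; the names `ClassEntry` and `parse_class_entry` are hypothetical, and treating a plain entry as "deliverable equals required" is an assumption about how the table is meant to be read.

```python
from typing import NamedTuple

VALID_CLASSES = {"C", "H", "M", "L", "U"}  # class codes from the Service Class table


class ClassEntry(NamedTuple):
    deliverable: str  # best class that can currently be delivered
    required: str     # class that is actually required


def parse_class_entry(entry: str) -> ClassEntry:
    """Parse a table cell such as 'H' or 'M->H' into (deliverable, required)."""
    parts = [p.strip() for p in entry.split("->")]
    if len(parts) == 1:
        parts = parts * 2  # assumption: a plain entry means deliverable == required
    if len(parts) != 2 or not all(p in VALID_CLASSES for p in parts):
        raise ValueError(f"unrecognised class entry: {entry!r}")
    return ClassEntry(deliverable=parts[0], required=parts[1])
```

For example, `parse_class_entry("M->H")` yields a deliverable class of M and a required class of H, which makes it straightforward to list the services where the two differ.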
The MoU minimum requirements for second shift would result in L in the columns for AS and OS. This has been ignored for the purposes of the first version but should be reviewed.
Maintenance Window
For planned changes, the following windows are defined during SC4. Maintenance windows are periods during which the services may run at reduced capacity (class C or H) or be unavailable (class M or L) without this being counted as downtime in the availability calculations.
These maintenance windows are used for operations such as
- Software upgrades
- System reboots where required
The actual window times will be defined later in the SC4 planning.
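To illustrate how maintenance windows could be left out of the availability calculation, the sketch below subtracts from each outage any time that falls inside a window. It is one possible reading of the rule above, not a defined SC4 procedure: the function names are hypothetical, the windows are assumed not to overlap one another, and the times in the example are invented because the real windows are not yet defined.

```python
from datetime import datetime, timedelta

# An interval is a (start, end) pair with start <= end.
Interval = tuple[datetime, datetime]


def overlap(a: Interval, b: Interval) -> timedelta:
    """Return the length of time shared by two intervals (zero if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(end - start, timedelta(0))


def counted_downtime(outages: list[Interval], windows: list[Interval]) -> timedelta:
    """Sum outage time, leaving out any part that falls inside a maintenance window.

    Assumption: the maintenance windows do not overlap one another.
    """
    total = timedelta(0)
    for outage in outages:
        total += outage[1] - outage[0]
        for window in windows:
            total -= overlap(outage, window)
    return total


# Hypothetical times for illustration only; the real SC4 windows are not yet defined.
window = (datetime(2006, 9, 5, 2, 0), datetime(2006, 9, 5, 4, 0))
outage = (datetime(2006, 9, 5, 1, 0), datetime(2006, 9, 5, 5, 0))
print(counted_downtime([outage], [window]))  # 4 hour outage, 2 hours of it inside the window
```

The resulting figure is what would then be compared against the downtime budget of the relevant service class.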
--
TimBell - 05 Sep 2005