CERN FTS Services
Production
Three FTS service instances run in production at CERN, all based on the FTS 2.1 SLC4 service, plus one FTS 2.2 instance used by ATLAS for T0-export production traffic.
Node Deployment
This is badly out of date - 20th March 2010.
Channel deployment
This shows which FTA agent daemons are hosted on which nodes.
This is badly out of date - 20th March 2010.
Installation and Configuration
Nodes
The FTS nodes are configured with Quattor, CDB NCM-YAIM and YAIM, as described in FtsTier0CDBConfiguration21.
Backend database
The Oracle 10g database backend for all the FTS servers is run by the IT-DM group on the LCG Oracle RAC: physics-database.support@cern.ch.
The DB connection parameters are contained in the YAIM configuration of CDB and delivered to the node via the SINDES fts_oracle_passwd cluster-level component.
CERN Site Information system
The EGEE.BDII is run as a resource BDII on all of the FTS web-service nodes ("FTS"). The CERN-PROD site BDII should be configured to pull from these using the cluster DNS alias.
The BDII GIP running on the FTS publishes the service discovery information and the channel information (i.e. which FTS channels the service instance runs).
It is configured by the FTS module of YAIM under CDB.
ldapsearch -x -H ldap://prod-bdii.cern.ch:2170 -b 'Mds-vo-name=CERN-PROD,o=Grid' \
  '(|(GlueServiceType=org.glite.ChannelAgent)(GlueServiceType=org.glite.ChannelManagement)(GlueServiceType=org.glite.Delegation)(GlueServiceType=org.glite.FileTransfer))'
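The result of such a query is LDIF, which can be long. As a minimal sketch, assuming the published entries carry the Glue attributes GlueServiceType and GlueServiceEndpoint, the helper below reduces the LDIF to one "type endpoint" line per service. The name list_glue_endpoints is hypothetical and not part of any FTS or BDII tooling. The sketch also assumes unwrapped LDIF, i.e. one attribute per line, which recent OpenLDAP clients produce with -o ldif-wrap=no.

```shell
#!/bin/sh
# Hypothetical helper: reads LDIF on stdin and prints "<type> <endpoint>"
# for every entry that has both GlueServiceType and GlueServiceEndpoint.
# Assumes unwrapped LDIF with a blank line between entries.
list_glue_endpoints() {
    awk '
        /^GlueServiceType: /     { stype = $2 }
        /^GlueServiceEndpoint: / { sendpoint = $2 }
        /^$/ {
            # A blank line ends an entry: emit it if complete, then reset.
            if (stype != "" && sendpoint != "") print stype, sendpoint
            stype = ""; sendpoint = ""
        }
        END {
            # The last entry may not be followed by a blank line.
            if (stype != "" && sendpoint != "") print stype, sendpoint
        }
    '
}

# Example use (not executed here, needs network access to the BDII):
#   ldapsearch -x -LLL -o ldif-wrap=no -H ldap://prod-bdii.cern.ch:2170 \
#       -b 'Mds-vo-name=CERN-PROD,o=Grid' \
#       '(GlueServiceType=org.glite.FileTransfer)' | list_glue_endpoints
```

Keeping the parsing in a function that reads stdin means it can be checked against a saved LDIF dump without contacting the BDII.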
Management Procedures
All nodes are set up as standard FIO nodes with regard to SMS, standard alarms and host certificate monitoring.
SMS states
The default SMS state is production.
- FTS webservice: Setting a node to maintenance removes it from the load-balanced DNS alias. This should be done before any intrusive operation is performed.
- FTA agents: Setting a node to maintenance currently has no effect on the daemons.
- FTM monitor: Setting a node to maintenance currently has no effect on the daemons.
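Since only the web-service alias reacts to the SMS state, it can be worth confirming that a node has actually left the alias before starting an intervention. The following is a minimal sketch, assuming the alias resolves to one A record per member node. The function name in_alias, the alias name and the address in the example are hypothetical placeholders, not the real CERN values.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the given IP address
# appears in the list of addresses read from stdin, one address per line.
# -F: fixed string, -x: match the whole line, -q: no output.
in_alias() {
    grep -Fxq -- "$1"
}

# Example use (alias name and address are placeholders):
#   dig +short fts-alias.example.org | in_alias 192.0.2.10 \
#       && echo "still in alias" || echo "removed from alias"
```

Matching whole lines avoids false positives when one address is a prefix of another. The alias may take some time to reflect a state change, depending on how the load balancing and DNS TTLs are configured.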
Monitoring
Lemon alarms
In addition to the standard FIO alarms, specific Lemon Alarms have been defined for the FTS.
Alarm name | Description | Comment |
TOMCAT_WRONG | No Tomcat processes running. | |
FTA_WRONG | One or more of the FTA agent daemons is down. | |
FTS_STUCK | The FTS web-service is not responding (although Tomcat itself is up). | |
GRID_BDII_WRONG | No BDII processes running. | |
These alarms, along with all standard alarms on the nodes, are handled by the operator and sysadmin teams. The procedures are all stored in OPM.
FTS Intervention Plans
Detailed plans for upcoming FTS interventions and records of minor interventions: FtsInterventions.
WLCG Service Review for FTS
The WLCG service review of FTS at CERN is described in FtsServiceReview20, with CERN-specific comments at FtsServiceReview20CERNPROD. They describe the impact of different failure types on the overall FTS service.