SAM Maintenance
Client components
Submission Client
The SAM submission client for the EGEE/LCG production grid infrastructure is installed on the machine lxn1182.cern.ch. The machine submits periodic tests as cron jobs running on behalf of individual local UNIX users.
Installed software
There are two RPMs installed on the machine:
- lcg-sam-client
- lcg-sam-client-sensors
The RPMs are installed from the SAM APT repository (see the installation manual) under the default location /opt/lcg/same/client
Configuration
- global configuration: all the files in /opt/lcg/same/client/etc
- local configuration (per user):
  - ~/.same.conf - defines VO-specific parameters that override the global config file
  - ~/same-cron/same-cron.sh - defines the list of sensors to be executed
  - ~/.grid_cert_passphrase - contains the passphrase of the user's certificate (used to automatically regenerate the VOMS proxy)
Log files
The submission client keeps a separate log file for each user who has the cron job installed. Each log file can be found in the following location:
~/same-cron/same-cron.log
Cron jobs
Several UNIX users on the machine have a SAM cron job installed:
- piotr - regular submission of all sensors for OPS VO
- judit - submission of CE and gCE for DTeam VO
- pnyczyk - submission of SE, SRM, LFC, FTS sensors for DTeam VO
- piotratlas - submission of all sensors for Atlas VO
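A quick health check for the cron jobs listed above is to look at how recently each user's log file was written. The following is only a sketch: the helper name check_same_logs is hypothetical, and both the location of the home directories (for example /home) and the age threshold are assumptions that are not taken from the actual setup.

```shell
# Hypothetical helper: report SAM cron logs that are missing or stale.
# $1 = directory that contains the home directories (assumed, e.g. /home)
# $2 = maximum acceptable log age in minutes (assumed threshold)
# remaining arguments = user names
check_same_logs() {
    home_base=$1
    max_age_min=$2
    shift 2
    for user in "$@"; do
        log="$home_base/$user/same-cron/same-cron.log"
        if [ ! -f "$log" ]; then
            echo "$user: MISSING $log"
        elif [ -n "$(find "$log" -mmin "-$max_age_min")" ]; then
            # find prints the path only if it was modified recently enough
            echo "$user: OK"
        else
            echo "$user: STALE $log"
        fi
    done
}
```

For example, `check_same_logs /home 120 piotr judit pnyczyk piotratlas` would print one line per user. A suitable threshold depends on how often each cron job runs.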
Server components
All server components for the production instance of SAM are installed on the machine lxn1181.cern.ch, which is accessible using the DNS alias lcg-sam.cern.ch.
Portal
TODO
Web services
SAM web services are distributed as a single software package that contains two web services: one for querying the SAM database and one for publishing test results. Both web services are used by the Submission Client and provide the only interface between the Submission Client and all the other components of SAM (DB, Portal, etc.).
Installed software
There is only one RPM installed and it provides both web services in one web application (webapp context) for Tomcat5:
The RPM requires Tomcat5 and Oracle InstantClient to be installed and configured so that the OCI JDBC driver can be used to connect to Oracle services at CERN from Tomcat. Please see the installation manual for more details.
Configuration
SAM web services are configured by customizing two files:
- /etc/tomcat5/Catalina/localhost/same-ws.xml - custom file based on the template in /opt/lcg/same/server/ws/same-ws.xml.template
  - DB connection parameters for the query web service
  - location of the configuration file for the publishing web service (gview.configuration)
  - (optionally) logging settings (log4j.configuration)
- /opt/lcg/same/server/ws/gridview.properties - DB connection parameters for the publishing web service
After making any changes, please make sure that all the files above are readable by the tomcat4 user. The default permissions are the following:
-rw-r----- 1 root tomcat4
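The default permissions above can be verified with a small check before Tomcat is restarted. This is a minimal sketch, assuming GNU stat (the -c option) is available; the helper name check_conf_perms is hypothetical.

```shell
# Hypothetical helper: compare a file's owner, group and mode with the
# expected values. Relies on GNU stat (-c); this is an assumption about
# the tools installed on the machine.
# $1 = file to check
# $2 = expected "owner:group mode", e.g. "root:tomcat4 640" for -rw-r-----
check_conf_perms() {
    file=$1
    want=$2
    have=$(stat -c '%U:%G %a' "$file") || return 2
    if [ "$have" = "$want" ]; then
        echo "OK $file"
    else
        echo "MISMATCH $file: have '$have', want '$want'"
        return 1
    fi
}

# Intended usage on the server (not executed here):
#   check_conf_perms /etc/tomcat5/Catalina/localhost/same-ws.xml "root:tomcat4 640"
#   check_conf_perms /opt/lcg/same/server/ws/gridview.properties "root:tomcat4 640"
```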
After making any changes to the configuration files please restart Tomcat.
Log files
SAM web services by default write all logging information to a single file:
/var/log/tomcat5/lcg-sam-server-ws.log
By default only errors and important events are logged.
Daemons
SAM web services run under the Tomcat5 application server. To start, stop or restart it, please use the following command:
service tomcat5 start|stop|restart
Outstanding actions
After upgrading to a new version of SAM web services from the RPM, please remove the relevant web application directory by using the following command:
rm -rf /var/lib/tomcat5/webapps/same-ws
Afterwards please restart Tomcat.
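The two post-upgrade steps can be wrapped in a small function so that they are always run together and in the same order. The sketch below is an illustration only: the function names and the DRY_RUN switch are hypothetical, and by default the commands are only printed, not executed.

```shell
# Hypothetical wrapper around the two post-upgrade steps described above.
# DRY_RUN defaults to 1, so nothing is changed unless DRY_RUN=0 is set.
run_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

post_upgrade_cleanup() {
    # Default path taken from the text above; an argument overrides it.
    webapp_dir=${1:-/var/lib/tomcat5/webapps/same-ws}
    run_cmd rm -rf "$webapp_dir" &&
    run_cmd service tomcat5 restart
}
```

Calling `post_upgrade_cleanup` prints the two commands for review, while `DRY_RUN=0 post_upgrade_cleanup` (as root) would actually run them.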
BDII synchronization script
The role of the BDII synchronization script is to discover sites and service nodes from top-level BDII(s) and merge the acquired information with the data provided by the GOC DB. The results are stored in the SAM database.
Installed software
There is only one RPM that provides the BDII synchronization script together with a few other components:
It requires Python, cx_Oracle (Oracle driver for Python) and Oracle InstantClient to be installed and configured so that Python can connect to Oracle services at CERN. Please see the installation manual for more details.
Configuration
There are several configuration files for the script located in /opt/lcg/same/server/db/cron. The configuration files are the following:
- b2o-prod.conf - for discovery of services belonging to "Certified" sites that are in "Production"
- b2o-pps.conf - for discovery of services belonging to "Certified" sites that are in "PPS"
- b2o-ctb.conf - for discovery of sites and services in the Certification Testbed (currently inactive, no cron job)
The directory contains some other configuration files, but they are used for the Validation and Development instances of SAM.
Log files
Currently there are the following log files available:
- /opt/lcg/var/log/b2o-prod.log
- /opt/lcg/var/log/b2o-pps.log
Cron jobs
All the cron jobs that trigger the BDII synchronization script are set up in the following location: /etc/cron.d/same-bdii2oracle.cron
After making any changes to the file please restart the cron daemon by using the following command:
service crond restart
Availability metrics calculation script
The role of the availability metrics calculation script is to calculate the overall site/service status every hour and to generate availability metrics data by calling the relevant PL/SQL procedures in the SAM database.
Installed software
There is only one RPM that provides the availability metrics calculation script together with a few other components:
All the functionality is provided by two scripts in the /opt/lcg/same/server/db/cron directory:
- same-metric-cron - BASH script that contains configuration parameters and executes the actual calculation script
- sameSummarization - Python script (executable) that calculates statuses and availability metrics for the last hour/day and triggers recalculation of any missing data if needed (fault tolerance)
The script requires Python, cx_Oracle (Oracle driver for Python) and Oracle InstantClient to be installed and configured so that Python can connect to Oracle services at CERN. Please see the installation manual for more details.
Configuration
The script is currently configured only by hardcoded parameters in the /opt/lcg/same/server/db/cron/same-metric-cron BASH script.
Log files
The logging information for the script can be found at /opt/lcg/var/log/same-metric-cron.log
Cron jobs
The cron job that triggers availability metrics calculation is set up in the following location: /etc/cron.d/same-metric-cron
After making any changes to the file please restart the cron daemon by using the following command:
service crond restart
XSQL interface to SAM DB and the alarm system
TODO
Resources
Backup of configuration files
To make it easier to recover from SAM client or server failures, we keep a backup of the current configuration files. They are stored in the /root directory on the SAM machines (lxn1180.cern.ch, lxn1181.cern.ch and lxn1182.cern.ch) as tarballs containing the whole set of configuration files for the client and the server respectively:
- sam-prod-client-config.tgz
- sam-prod-server-config.tgz
For security reasons the certificate passphrase files for client are not included in the tarball.
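As an illustration of how a client tarball could be refreshed without including the passphrase files, the sketch below archives only the two per-user configuration files named in the client configuration section. It is not the procedure actually used to create the existing tarballs: the function name, its arguments and the location of the home directories are assumptions, and the global files under /opt/lcg/same/client/etc would have to be added separately.

```shell
# Hypothetical helper: archive per-user client configuration files.
# $1 = output tarball
# $2 = directory that contains the home directories (assumed, e.g. /home)
# remaining arguments = user names
backup_client_config() {
    out=$1
    home_base=$2
    shift 2
    n=$#
    for user in "$@"; do
        # Append only the two configuration files; ~/.grid_cert_passphrase
        # is deliberately never added to the list.
        set -- "$@" "$user/.same.conf" "$user/same-cron/same-cron.sh"
    done
    # Drop the user names, leaving only the relative file paths.
    shift "$n"
    tar -czf "$out" -C "$home_base" "$@"
}
```

For example, `backup_client_config /root/sam-prod-client-config.tgz /home piotr judit pnyczyk piotratlas` would store paths relative to the home base directory, such as piotr/.same.conf.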
RPM lists
This section lists the RPMs to be installed on the SAM Client and Server machines. It does not list all the necessary dependencies, which should be handled either by the packaging system (APT/YUM) or manually by the administrator.
Client (lxn1182.cern.ch - SLC3 + gLite UI ver. 3.0)
- lcg-sam-client
- lcg-sam-client-sensors
Server (lxn1181.cern.ch - SLC4)
- lcg-sam-server-portal
- lcg-sam-server-ws
- lcg-sam-server-db
- lcg-sam-server-xsql
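The lists above can be compared with what is actually installed on a machine. The following is a minimal sketch that relies only on the exit status of `rpm -q`; the helper name missing_rpms is hypothetical.

```shell
# Hypothetical helper: print every package from the argument list that
# "rpm -q" does not report as installed.
missing_rpms() {
    for pkg in "$@"; do
        rpm -q "$pkg" >/dev/null 2>&1 || echo "$pkg"
    done
}
```

On the server, `missing_rpms lcg-sam-server-portal lcg-sam-server-ws lcg-sam-server-db lcg-sam-server-xsql` would print nothing if all four packages are present.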