This page describes how to maintain the DM/WM software manage scripts. All the code resides in the DM/WM github deployment. For each service there is a sub-directory which contains the deployment script deploy, the server management script manage, and any configuration files except security-sensitive authentication data.
The scripts are designed to resemble Linux /etc/init.d service management scripts.
Please do not commit any changes directly. Please submit tested changes as per dev-git guidelines. All changes pertaining to the service – deploy scripts, manage scripts, monitoring specs and server configuration files – should be supplied in the patch series. We will review your changes and may request modifications. Once the changes are approved, we will commit them to github for you. You should assume that your deployment actions may be progressively cleaned up each time a new server version is submitted.
The management scripts are used to start and stop services, and to get service status. Where appropriate the script should allow specific server sub-components to be controlled separately. The script also provides succinct help to give users an overview of what it can do; more complete details should be given on service-specific wiki pages. The management operations should be specific to one exact service, not all services of that kind; see below for the implications of this.
Simply run the manage script, providing the action and, for start/stop actions, a security string. The security string should be something which ensures the operator is very likely to be following copy-and-paste instructions from the service documentation, not executing random scripts on the system which look executable. Examples of using the management script:
Get help or status, or some other action not affecting the service - no security string checking:
sudo -H -u _app bashs -lc "/data/current/config/app/manage help"
sudo -H -u _app bashs -lc "/data/current/config/app/manage version"
sudo -H -u _app bashs -lc "/data/current/config/app/manage status"
Start, stop or any other action which affects the service - must use a security string check:
sudo -H -u _app bashs -lc "/data/current/config/app/manage restart 'security string'"
sudo -H -u _app bashs -lc "/data/current/config/app/manage start 'security string'"
sudo -H -u _app bashs -lc "/data/current/config/app/manage stop 'security string'"
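The check function in the template further down this page compares the MD5 checksum of the security string, as written by echo (so including a trailing newline), against a value embedded in the script. A minimal sketch of how a maintainer might compute that value for a new string follows; the function name security_checksum and the sample string are hypothetical and not part of the template:

```shell
# Hypothetical helper: print the MD5 checksum that a check of the form
#   echo "$1" | md5sum | awk '{print $1}'
# would compute for the given security string.
security_checksum()
{
  # printf '%s\n' writes the string followed by one newline, which is
  # what echo "$1" produces for ordinary strings.
  printf '%s\n' "$1" | md5sum | awk '{print $1}'
}

# Example call (the string is a placeholder, not a real security string):
#   security_checksum 'I did read documentation'
```

The printed value is what would be placed in the comparison inside check; the security string itself then only needs to appear in the service documentation.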
The management script should do everything required to start, stop and get the status of the service it operates, but nothing else. The script should not touch the installation or configuration files in any way. It can however modify server state files, for example to reload the server configuration after an upgrade. In production the script will run under an application-specific daemon account which can only read, not write to, the software area and the configuration files.
Typically the script prepares the environment for the service, and then runs commands affecting the service: to start it, to stop it, or to check its status. More complete requirements are listed below:
Use the template below for a new management script.
Always test new scripts in your private test installation, preferably using a developer virtual machine with a realistic multi-account setup. Exercise all actions supported by your management script and make sure they perform the desired actions and nothing else. You can use sh -x manage <options> to see exactly which commands are executed. Verify that it does not attempt to operate servers not under its management.
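As a starting point for such testing, the read-only actions can be exercised in a loop. This is only a sketch under assumptions: the function name smoke_test is hypothetical, the script path is passed as an argument, and the actions that need a security string (start, stop, restart) are deliberately left out and should be tested by hand.

```shell
# Hypothetical smoke test: run the read-only actions of a manage script
# and report the outcome of each one.
# Usage: smoke_test /path/to/manage
smoke_test()
{
  MANAGE=$1
  FAILED=0
  for ACTION in help version status; do
    if sh "$MANAGE" "$ACTION" >/dev/null 2>&1; then
      echo "ok: $ACTION"
    else
      # In the else branch $? still holds the exit code of the command above.
      echo "FAILED: $ACTION (exit code $?)"
      FAILED=1
    fi
  done
  return $FAILED
}
```

The function returns non-zero if any action failed, so it can be chained with other checks in a test installation.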
It is highly recommended to verify the following deployment / reboot / server management combinations all work. The server should restart automatically on each reboot, and should do the “right thing”, e.g. if rebooted after an upgrade.
#!/bin/sh
##H Usage: manage ACTION [SECURITY-STRING]
##H
##H Available actions:
##H   help        show this help
##H   version     get current version of the service
##H   status      show current service's status
##H   sysboot     start server from crond if not running
##H   restart     (re)start the service
##H   start       (re)start the service
##H   stop        stop the service
##H
##H For more details please refer to operations page:
##H   https://twiki.cern.ch/twiki/bin/view/CMS/<twiki-page>
# Refuse to run under the shared cmsweb account.
if [ "$(id -un)" = cmsweb ]; then
  echo "ERROR: please use another account" 1>&2
  exit 1
fi
ME=$(basename $(dirname $0))
TOP=$(cd $(dirname $0)/../../.. && pwd)
ROOT=$(cd $(dirname $0)/../.. && pwd)
CFGDIR=$(dirname $0)
LOGDIR=$TOP/logs/$ME
STATEDIR=$TOP/state/$ME
COLOR_OK="\\033[0;32m"
COLOR_WARN="\\033[0;31m"
COLOR_NORMAL="\\033[0;39m"
. $ROOT/apps/<app-name>/etc/profile.d/init.sh
# Start service conditionally on crond restart.
sysboot()
{
  if [ $(pgrep -u $(id -u) -f "<PROCESS-PATTERN>" | wc -l) = 0 ]; then
    start
  fi
}
# Start the service.
start()
{
  cd $STATEDIR
  echo "starting $ME"
  <RUN-THE-SERVER> </dev/null 2>&1 |
    rotatelogs $LOGDIR/<APP>-%Y%m%d.log 86400 >/dev/null 2>&1 &
}
# Stop the service.
stop()
{
  echo "stopping $ME"
  for PID in $(pgrep -u $(id -u) -f "<PROCESS-PATTERN>" | sort -rn); do
    PSLINE=$(ps -o pid=,bsdstart=,args= $PID |
             perl -n -e 'print join(" ", (split)[0..6])')
    echo "Stopping $PID ($PSLINE):"
    kill -9 $PID
  done
}
# Check if the server is running.
status()
{
  pid=$(pgrep -u $(id -u) -f "<PROCESS-PATTERN>" | sort -n)
  if [ X"$pid" = X ]; then
    echo -e "$ME is ${COLOR_WARN}NOT RUNNING${COLOR_NORMAL}."
  else
    echo -e "$ME is ${COLOR_OK}RUNNING${COLOR_NORMAL}, PID" $pid
  fi
}
# Verify the security string.
check()
{
  CHECK=$(echo "$1" | md5sum | awk '{print $1}')
  if [ "$CHECK" != 94e261a5a70785552d34a65068819993 ]; then
    echo "$0: cannot complete operation, please check documentation." 1>&2
    exit 2
  fi
}
# Main routine, perform action requested on command line.
case ${1:-status} in
  sysboot )
    if ps -oargs= $PPID | grep -q crond; then
      sysboot
    else
      echo "$0: sysboot is for cron only" 1>&2
      exit 1
    fi
    ;;
  start | restart )
    check "$2"
    stop
    start
    status
    ;;
  status )
    status
    ;;
  stop )
    check "$2"
    stop
    status
    ;;
  help )
    perl -ne '/^##H/ && do { s/^##H ?//; print }' < $0
    ;;
  version )
    echo "$<PROJECT-NAME>_VERSION"
    ;;
  * )
    echo "$0: unknown action '$1', please try '$0 help' or documentation." 1>&2
    exit 1
    ;;
esac
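The help action in the template works by printing every line of the script that starts with ##H, after stripping that marker and one following space. The same extraction can be sketched without perl, which may be convenient when trying the convention out on a host where perl is not installed. The function name show_help is hypothetical and the template itself uses the perl one-liner:

```shell
# Hypothetical perl-free variant of the help action: print each line
# that starts with "##H", with the marker and at most one following
# space removed. Lines without the marker are not printed.
# Usage: show_help /path/to/manage
show_help()
{
  sed -n 's/^##H \{0,1\}//p' "$1"
}
```

Because only the first space after the marker is removed, any further indentation in the ##H lines is kept in the printed help text.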