LCG Production Services -
LCG Grid Deployment
How to (re-)install a machine from scratch?
See also the wiki page written by
Louis.Poncet@cernNOSPAMPLEASE.ch for the
certification testbed.
Steps to follow
First, you need read and execute access to the following directory from
lxplus or
lxadm:
/afs/cern.ch/project/gd/yaim-server
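A quick way to confirm this access before going further is sketched below. The helper name has_read_exec is hypothetical and introduced only for illustration; the path is the one given above.

```shell
# Directory named in the text above.
YAIM_SERVER_DIR=/afs/cern.ch/project/gd/yaim-server

# has_read_exec DIR (hypothetical helper): succeed only if DIR exists,
# is a directory, and is both readable and executable (searchable) for us.
has_read_exec() {
    [ -d "$1" ] && [ -r "$1" ] && [ -x "$1" ]
}

if has_read_exec "$YAIM_SERVER_DIR"; then
    echo "access OK: $YAIM_SERVER_DIR"
else
    echo "no read/execute access to $YAIM_SERVER_DIR" >&2
fi
```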
There is now only one script to use:
- /afs/cern.ch/project/gd/yaim-server/install.sh:
This script installs or re-installs a machine from a given kickstart template (one template per operating system). This step should take at most 15 minutes.
Syntax: install.sh <os> <node(s)>
where:
- <os> is one of:
- sl3-sec: operating system SLC3 with security packages lcg-fw and gd-auth already installed.
- sl4-sec: operating system SLC4 with security packages lcg-fw and gd-auth already installed.
- sl4-lxb6-sec: operating system SLC4 with security packages lcg-fw and gd-auth already installed for lxb61xx nodes only.
- sl4-lxb6-sec-64: operating system SLC4 with security packages lcg-fw and gd-auth already installed for 64-bit nodes only.
All other kickstart templates offered by
install.sh are no longer supported, since they do not configure the
firewall and
authentication settings. If you need a specific kickstart template, please contact
Yvan.Calas@cernNOSPAMPLEASE.ch or
Louis.Poncet@cernNOSPAMPLEASE.ch.
- <node(s)> is a list of machines to install.
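As an illustration of the syntax above, the following sketch checks the template name against the four supported values and prints the resulting command line instead of running it. The helper names is_supported_os and build_install_cmd are hypothetical, and passing several nodes separated by spaces is an assumption: the text only says that <node(s)> is a list.

```shell
# Script path taken from the text above.
INSTALL_SH=/afs/cern.ch/project/gd/yaim-server/install.sh

# is_supported_os NAME (hypothetical helper): succeed if NAME is one of
# the four supported kickstart templates listed above.
is_supported_os() {
    case "$1" in
        sl3-sec|sl4-sec|sl4-lxb6-sec|sl4-lxb6-sec-64) return 0 ;;
        *) return 1 ;;
    esac
}

# build_install_cmd OS NODE... (hypothetical helper): print the install.sh
# command line, or fail if the template is unsupported or no node is given.
# Assumption: nodes are passed as space-separated arguments.
build_install_cmd() {
    os=$1
    shift
    if ! is_supported_os "$os"; then
        echo "unsupported kickstart template: $os" >&2
        return 1
    fi
    if [ "$#" -eq 0 ]; then
        echo "no node given" >&2
        return 1
    fi
    echo "$INSTALL_SH $os $*"
}

# Dry run: prints the command that would install SLC4 on gdrb05.
build_install_cmd sl4-sec gdrb05
```

Printing the command rather than executing it leaves a chance to review the node names before anything is registered for installation.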
You are then prompted for your AFS login and your NICE password. You can check that all the machines have been registered correctly with the following command:
[lxplus066] /afs/cern.ch/project/linux/redhat/kickstart/bin > ./aims show gdrb05
09:48:40 lxnfs4 > name: gdrb05 hw: 00:02:B3:AF:90:80 ks:cfg/lcg/gdrb05.cfg pxe: SLC308_i386 if: link status: OK
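When several machines are registered at once, the check can be scripted. The sketch below parses one line of aims show output; the field layout ("pxe: <image>", "status: OK") is inferred only from the sample lines on this page, so the real output may differ. The helper names aims_field and is_registered are hypothetical.

```shell
# aims_field LINE KEY (hypothetical helper): print the word that follows
# "KEY: " in one line of `aims show` output.
aims_field() {
    printf '%s\n' "$1" | sed -n "s/.* $2: \([^ ]*\).*/\1/p"
}

# is_registered LINE (hypothetical helper): succeed if the status is OK and
# a PXE image is set (i.e. the pxe field is neither empty nor "disabled").
is_registered() {
    pxe=$(aims_field "$1" pxe)
    [ "$(aims_field "$1" status)" = "OK" ] && [ -n "$pxe" ] && [ "$pxe" != "disabled" ]
}

# Sample line copied from the output above; on lxplus it would come from
# something like: line=$(./aims show gdrb05)
line='09:48:40 lxnfs4 > name: gdrb05 hw: 00:02:B3:AF:90:80 ks:cfg/lcg/gdrb05.cfg pxe: SLC308_i386 if: link status: OK'

if is_registered "$line"; then
    echo "gdrb05 registered with image $(aims_field "$line" pxe)"
fi
```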
You then have to reboot the machine(s) in order to begin the installation.
!!! WARNING !!!: you are responsible for the machines you install. Before installing, check that the machine belongs to you (see the
node status web page). Please read the
FAQ below if you need to unregister a machine.
Once the machine has been reinstalled, you can log in to it using your SSH public key or Kerberos authentication.
Important:
- Installing an SL3 node requires an extra step. You must first execute /root/RUN_ME_FIRST in order to:
- Install an AFS client.
- Remove some useless packages.
- Upgrade the kernel if needed (the machine will be automatically rebooted).
- The post-install script must no longer be used.
Special instructions for nodes in the lxb173x or adc00xx series for SLC3
The installation of
SLC3 on these nodes needs to be dealt with in a special way:
- You need an AFS client running on the node:
apt-get -y install ccdb-tools openafs openafs-server openafs-client openafs-compat \
openafs-kpasswd openafs-krb5 kernel-module-openafs-`uname -a | cut -d " " -f3` lcm compat-db
export PERLLIB=/usr/lib/perl
lcm --configure afsclt krb5clt krb4clt srvtab
/etc/init.d/afs start
/afs/cern.ch/project/linux/redhat/kickstart/bin/kickstart-me -k KICKSTART_FILE IMAGE_FILE
KICKSTART_FILE must be reachable from AFS or from an HTTP server. Please contact
Yvan.Calas@cernNOSPAMPLEASE.ch, for example, to enable access to the kickstart file for your machine. For instance, to install slc308 on
lxb1734, execute the following command line on the node itself:
/afs/cern.ch/project/linux/redhat/kickstart/bin/kickstart-me -k http://lxb2007.cern.ch/kickstart/lxb1734.cfg slc308
Important: The URL found in the kickstart file must be the same as the one you specify for IMAGE_FILE. Otherwise, the installation will not work.
- Detailed instructions on how to use kickstart-me can be found here.
- Then reboot the machine to start the reinstallation.
- Wait for the installation to complete.
- Log in to the machine using the well-known password.
- Execute the post-install script. (Note that the third password requested is not the general one but the one defined at cluster level.)
- Note that for these machines, the GD firewall and root access rules are not set at the end of the installation. You must:
- Install the lcg-fw and gd-auth packages (see the instructions here and here, respectively).
- Run the following cron jobs by hand: /etc/cron.hourly/firewall and /etc/cron.hourly/auth.
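Running the two cron jobs by hand can be wrapped in a small helper that stops at the first missing or failing job. This is only a sketch: the function name run_jobs is hypothetical, and it assumes both jobs are plain executable scripts that return a non-zero status on failure.

```shell
# run_jobs FILE... (hypothetical helper): run each job in order, stopping
# at the first one that is missing, not executable, or that fails.
run_jobs() {
    for job in "$@"; do
        if [ ! -x "$job" ]; then
            echo "missing or not executable: $job" >&2
            return 1
        fi
        "$job" || return 1
    done
}

# On the freshly installed node, as root, once lcg-fw and gd-auth are installed:
#   run_jobs /etc/cron.hourly/firewall /etc/cron.hourly/auth
```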
FAQ: problems during the installation
I cannot execute install.sh
Please contact
Yvan.Calas@cernNOSPAMPLEASE.ch or
Louis.Poncet@cernNOSPAMPLEASE.ch.
I made an error in the name of the machine to reinstall. How can I de-register this machine?
If you want to de-register lxb2011, proceed as follows:
[lxplus059-13:02:58] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims show lxb2011
13:02:58 lxnfs1 > name: lxb2011 hw: 00:D0:B7:B8:54:22 ks:cfg/lcg/lxb2011.cfg pxe: SLC305_i386 if: link status: OK
[lxplus059-13:02:59] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims pxeoff lxb2011
13:33:20 lxnfs1 > Turned on IT-CS HCP replies for lxb2011
[lxplus059-13:33:21] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims show lxb2011
13:33:34 lxnfs1 > name: lxb2011 hw: 00:D0:B7:B8:54:22 ks:cfg/lcg/lxb2011.cfg pxe: disabled status: OK
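The same sequence can be scripted so that the de-registration is verified automatically. The sketch below uses only the two subcommands shown in the transcript (pxeoff and show) and relies on the "pxe: disabled" text seen in the sample output, which may not match every version of the tool. The function name pxe_off_and_verify and the overridable AIMS variable are hypothetical.

```shell
# Path to the aims tool as used in the transcript above; it can be overridden
# through the AIMS environment variable (an assumption made for this sketch).
AIMS=${AIMS:-/afs/cern.ch/project/linux/redhat/kickstart/bin/aims}

# pxe_off_and_verify HOST (hypothetical helper): turn PXE installation off
# for HOST, then succeed only if `aims show` reports "pxe: disabled".
pxe_off_and_verify() {
    "$AIMS" pxeoff "$1" || return 1
    "$AIMS" show "$1" | grep -q 'pxe: disabled'
}

# Usage from lxplus or lxadm:
#   pxe_off_and_verify lxb2011 && echo "lxb2011 de-registered"
```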
15 minutes after the start of the installation, I cannot access the machine but I can ping it:
That generally means that there was a problem at the beginning of or during the installation of the OS. You need to go to the Computer Center (CC) to find out what is wrong (check the messages on the consoles: ALT+F1, ALT+F2, ALT+F3, ALT+F4, ALT+F5). Please reboot the machine and check whether the problem occurs again. If it does, please contact
Yvan.Calas@cernNOSPAMPLEASE.ch or
Louis.Poncet@cernNOSPAMPLEASE.ch.
At the end of the installation, another installation starts again on the same machine:
That means that the machine you want to (re-)install has not been correctly unregistered. You have to unregister it manually during the installation:
- either from the machine in the CC: ALT+F2, then telnet linuxsoft.cern.ch 7777.
- or from lxplus or lxadm: /afs/cern.ch/project/linux/redhat/kickstart/bin/aims pxeoff hostname
Other problems
Please contact
Yvan.Calas@cernNOSPAMPLEASE.ch or
Louis.Poncet@cernNOSPAMPLEASE.ch