LCG Production Services - LCG Grid Deployment

How to (re-)install a machine from scratch?

You can also take a look at the wiki page written by Louis.Poncet@cernNOSPAMPLEASE.ch for the certification testbed.

Steps to follow

First, you need read and execute access to the following directory from lxplus or lxadm: /afs/cern.ch/project/gd/yaim-server

There is now only one script to use:

  • /afs/cern.ch/project/gd/yaim-server/install.sh:

This script installs or re-installs a machine using a given kickstart template; each template corresponds to an operating system. This step should take at most 15 minutes.

Syntax: install.sh <os> <node(s)> where:

  • <os> could be:
    • sl3-sec: operating system SLC3 with security packages lcg-fw and gd-auth already installed.
    • sl4-sec: operating system SLC4 with security packages lcg-fw and gd-auth already installed.
    • sl4-lxb6-sec: operating system SLC4 with security packages lcg-fw and gd-auth already installed for lxb61xx nodes only.
    • sl4-lxb6-sec-64: operating system SLC4 with security packages lcg-fw and gd-auth already installed, for 64-bit nodes only.
All the other kickstart templates offered by install.sh are no longer supported because they do not configure the firewall and authentication settings. If you need a specific kickstart template, please contact Yvan.Calas@cernNOSPAMPLEASE.ch or Louis.Poncet@cernNOSPAMPLEASE.ch.

  • <node(s)> is a list of machines to install.
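Since only four templates are still supported, it can help to reject any other name before install.sh is called. The following is a minimal sketch, not part of the yaim-server tools: the function names `is_supported_template` and `safe_install` are hypothetical, and passing the nodes as separate arguments is an assumption, because this page does not say how install.sh expects the node list to be delimited.

```shell
#!/bin/sh
# Path to the installer, as given on this page.
INSTALL_SH=/afs/cern.ch/project/gd/yaim-server/install.sh

# Succeed only for the four templates listed as supported above.
is_supported_template() {
    case "$1" in
        sl3-sec|sl4-sec|sl4-lxb6-sec|sl4-lxb6-sec-64) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: safe_install <os> <node> [<node> ...]
# Assumption: install.sh accepts the nodes as separate arguments.
safe_install() {
    if [ $# -lt 2 ]; then
        echo "usage: safe_install <os> <node> [<node> ...]" >&2
        return 2
    fi
    os=$1
    shift
    if ! is_supported_template "$os"; then
        echo "unsupported kickstart template: $os" >&2
        return 1
    fi
    "$INSTALL_SH" "$os" "$@"
}
```

For example, `safe_install sl4-sec gdrb05` would hand over to install.sh, while `safe_install slc3 gdrb05` would stop with an error before anything is registered.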

You are then prompted for your AFS login and your NICE password. You can check that all the machines have been registered correctly with the following command:

     [lxplus066] /afs/cern.ch/project/linux/redhat/kickstart/bin > ./aims show gdrb05
     09:48:40 lxnfs4 > name: gdrb05 hw: 00:02:B3:AF:90:80 ks:cfg/lcg/gdrb05.cfg pxe: SLC308_i386 if: link status: OK
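When several machines are registered at once, reading each line by eye becomes tedious. The sketch below pulls the `pxe:` value out of one line of `aims show` output and treats the node as registered when that value is not `disabled` and the line ends in `status: OK`. The function names are hypothetical, and the field layout is assumed from the sample outputs on this page only.

```shell
#!/bin/sh
# Print the value that follows "pxe:" in one line of `aims show` output.
aims_pxe_target() {
    printf '%s\n' "$1" | sed -n 's/.*pxe: *\([^ ]*\).*/\1/p'
}

# Succeed when the line reports "status: OK" and a PXE target
# other than "disabled".
aims_is_registered() {
    case "$1" in
        *"status: OK"*) ;;
        *) return 1 ;;
    esac
    target=$(aims_pxe_target "$1")
    [ -n "$target" ] && [ "$target" != "disabled" ]
}
```

A possible use is `aims_is_registered "$(./aims show gdrb05)" || echo "gdrb05 is not registered"`.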

You then have to reboot the machine(s) in order to begin the installation.

!!! WARNING !!!: you are responsible for the machines you install. Before installing, check that the machine belongs to you (see the node status web page). If you need to unregister a machine, please read the FAQ below.

Once the machine has been reinstalled, you can log in to it using your ssh public key or Kerberos authentication.

Important:

  • The installation of an SL3 node is not straightforward. You must first execute /root/RUN_ME_FIRST in order to:
    • Install an AFS client.
    • Remove some useless packages.
    • Upgrade the kernel if needed (the machine will be automatically rebooted).
  • The post-install script must not be used anymore.

Special instructions for nodes in the lxb173x or adc00xx series for SLC3

The installation of SLC3 on these nodes needs to be dealt with in a special way:

  • You need an AFS client running on the node:
apt-get -y install ccdb-tools openafs openafs-server openafs-client openafs-compat \
openafs-kpasswd openafs-krb5 kernel-module-openafs-`uname -r` lcm compat-db
export PERLLIB=/usr/lib/perl
lcm --configure afsclt krb5clt krb4clt srvtab
/etc/init.d/afs start

  • Log on to the node and run
/afs/cern.ch/project/linux/redhat/kickstart/bin/kickstart-me -k KICKSTART_FILE IMAGE_FILE

The KICKSTART_FILE must be reachable from AFS or from an HTTP server. If you need access to the kickstart file for your machine to be enabled, please contact Yvan.Calas@cernNOSPAMPLEASE.ch. For example, if you want to install slc308 on lxb1734, you have to execute the following command line on the node itself:

/afs/cern.ch/project/linux/redhat/kickstart/bin/kickstart-me -k http://lxb2007.cern.ch/kickstart/lxb1734.cfg slc308

Important: The URL found in the kickstart file must be the same as the one you specify for IMAGE_FILE. Otherwise, it will not work.
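Because a typo in the kickstart URL or image name only shows up after the reboot, it can be worth printing the full command for review before running it. The sketch below only builds the command line. The function name `kickstart_cmd` and the `KS_BASE_URL` variable are hypothetical, and the `<base>/<node>.cfg` layout is an assumption taken from the single lxb1734 example above.

```shell
#!/bin/sh
KICKSTART_ME=/afs/cern.ch/project/linux/redhat/kickstart/bin/kickstart-me

# Assumption: kickstart files are served as <base>/<node>.cfg, as in the
# lxb1734 example. Override KS_BASE_URL if yours lives elsewhere.
KS_BASE_URL=${KS_BASE_URL:-http://lxb2007.cern.ch/kickstart}

# Print (do not run) the kickstart-me command for a node and an image.
kickstart_cmd() {
    node=$1
    image=$2
    printf '%s -k %s/%s.cfg %s\n' "$KICKSTART_ME" "$KS_BASE_URL" "$node" "$image"
}
```

Running `kickstart_cmd lxb1734 slc308` prints the same command line as the example above, which can then be checked and pasted on the node.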

  • Detailed instructions on how to use kickstart-me can be found here.
  • Then reboot the machine to start the reinstallation.
  • Wait.
  • Log in to the machine using the well-known password.
  • Execute the post-install script. (Be aware that the third password requested is not the general one but the one defined at cluster level.)
  • Note that for these machines, the GD firewall root access rules are not set at the end of the installation. You must:
    • Install the lcg-fw and gd-auth packages (see the instructions here and here, respectively).
    • Run the following cron jobs by hand: /etc/cron.hourly/firewall and /etc/cron.hourly/auth.
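Running the two cron jobs by hand can be wrapped in a small function that stops at the first missing or failing script, so that a half-configured node is noticed immediately. This is a sketch only: the function name `run_gd_cron_jobs` is hypothetical, and the optional directory argument exists purely so the function can be tried on a scratch directory.

```shell
#!/bin/sh
# Run the firewall and auth cron jobs once, in the order listed above.
# $1: directory holding the jobs (defaults to /etc/cron.hourly).
run_gd_cron_jobs() {
    cron_dir=${1:-/etc/cron.hourly}
    for job in firewall auth; do
        if [ ! -x "$cron_dir/$job" ]; then
            echo "missing or not executable: $cron_dir/$job" >&2
            return 1
        fi
        "$cron_dir/$job" || { echo "$job failed" >&2; return 1; }
    done
}
```

On a freshly installed node this would be called as root with no argument: `run_gd_cron_jobs`.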

FAQ: problems during the installation

I cannot execute install.sh

Please contact Yvan.Calas@cernNOSPAMPLEASE.ch or Louis.Poncet@cernNOSPAMPLEASE.ch.

I made an error in the name of the machine to reinstall. How can I de-register this machine?

If you want to de-register lxb2011, follow these steps:

[lxplus059-13:02:58] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims show lxb2011
13:02:58 lxnfs1 > name:     lxb2011 hw: 00:D0:B7:B8:54:22 ks:cfg/lcg/lxb2011.cfg pxe: SLC305_i386 if: link status: OK
[lxplus059-13:02:59] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims pxeoff lxb2011
13:33:20 lxnfs1 > Turned on IT-CS HCP replies for lxb2011
[lxplus059-13:33:21] /afs/cern.ch/project/linux/redhat/kickstart/bin > aims show lxb2011
13:33:34 lxnfs1 > name:     lxb2011 hw: 00:D0:B7:B8:54:22 ks:cfg/lcg/lxb2011.cfg pxe: disabled status: OK
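The transcript above can be condensed into one function that disables PXE and then confirms that `aims show` reports `pxe: disabled`. This is a sketch under assumptions: the function name `deregister_node` and the overridable `AIMS` variable are hypothetical, and the check relies on the output format shown in the transcript.

```shell
#!/bin/sh
# aims client, as used in the transcript above; overridable for dry runs.
AIMS=${AIMS:-/afs/cern.ch/project/linux/redhat/kickstart/bin/aims}

# Disable PXE for one host, then confirm that `aims show` reports it.
deregister_node() {
    host=$1
    "$AIMS" pxeoff "$host" >/dev/null || return 1
    if "$AIMS" show "$host" | grep -q 'pxe: disabled'; then
        echo "$host: PXE boot disabled"
    else
        echo "$host: still registered, check manually" >&2
        return 1
    fi
}
```

For the example above, `deregister_node lxb2011` would replace the three manual commands.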

15 minutes after the beginning of the installation, I cannot access this machine but I can ping it:

That generally means that there was a problem at the beginning of or during the installation of the OS. You need to go to the Computer Center (CC) to find out what is wrong (see the messages on the console screens: ALT+F1, ALT+F2, ALT+F3, ALT+F4, ALT+F5). Please reboot the machine and check whether the problem occurs again. If it does, please contact Yvan.Calas@cernNOSPAMPLEASE.ch or Louis.Poncet@cernNOSPAMPLEASE.ch.
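To tell an installation that is still running from one that is stuck, one option is to poll the ssh port for the 15 minutes the installation is expected to take, and only go to the CC if it never opens. The sketch below assumes a netcat that supports the `-z` and `-w` options; the function names `probe_ssh` and `wait_for_ssh` are hypothetical.

```shell
#!/bin/sh
# Default probe: succeed if TCP port 22 on the host accepts connections.
# Assumption: a netcat with the -z (scan) and -w (timeout) options is installed.
probe_ssh() {
    nc -z -w 5 "$1" 22 >/dev/null 2>&1
}

# wait_for_ssh <host> [timeout_s] [interval_s]
# Poll until the probe succeeds or the timeout (default 900 s) expires.
wait_for_ssh() {
    host=$1
    timeout=${2:-900}
    interval=${3:-30}
    waited=0
    while [ "$waited" -lt "$timeout" ]; do
        if probe_ssh "$host"; then
            echo "$host: ssh reachable after ${waited}s"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "$host: no ssh after ${timeout}s, check the console in the CC" >&2
    return 1
}
```

Started right after the reboot, `wait_for_ssh lxb2011` returns as soon as the node accepts ssh connections, or fails after 15 minutes.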

At the end of the installation, another installation starts again on the same machine:

That means that the machine you want to (re-)install has not been correctly unregistered. You have to unregister it manually during the installation:

  • either from the machine in the CC: ALT+F2, then telnet linuxsoft.cern.ch 7777.

  • or from lxplus or lxadm: /afs/cern.ch/project/linux/redhat/kickstart/bin/aims pxeoff hostname

Other problems

Please contact Yvan.Calas@cernNOSPAMPLEASE.ch or Louis.Poncet@cernNOSPAMPLEASE.ch.

Topic revision: r8 - 2007-02-19 - YvanCalas
 