INSPIRE on the Agile Infrastructure

INSPIRE currently runs on VMs (hosted on http://openstack.cern.ch/) and on hardware machines. All machines are controlled via Puppet, whose web interface is hosted at http://judy.cern.ch/

How to create a new VM

  1. Once you have selected the GS Inspire project in OpenStack, upload the file https://openstack.cern.ch/dashboard/project/access_and_security/api_access/openrc/ to your AFS home
  2. $ ssh aiadm
  3. aiadm $ eval $(ai-rc "GS Inspire") # or "GS Inspire critical power" or --same-project-as inspireXX
  4. aiadm $ nova flavor-list
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+
| ID | Name       | Memory_MB | Disk | Ephemeral | Swap | VCPUs | RXTX_Factor | Is_Public |
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+
| 1  | m1.tiny    | 512       | 0    | 0         |      | 1     | 1.0         | True      |
| 2  | m1.small   | 2048      | 20   | 0         |      | 1     | 1.0         | True      |
| 20 | hep2.1     | 2048      | 70   | 20        |      | 1     | 1.0         | False     |
| 21 | hep2.2     | 4096      | 70   | 40        |      | 2     | 1.0         | False     |
| 22 | hep2.4     | 8192      | 70   | 80        |      | 4     | 1.0         | False     |
| 23 | hep2.8     | 16000     | 70   | 160       |      | 8     | 1.0         | False     |
| 3  | m1.medium  | 4096      | 40   | 0         |      | 2     | 1.0         | True      |
| 4  | m1.large   | 8192      | 80   | 0         |      | 4     | 1.0         | True      |
| 50 | win.small  | 2048      | 60   | 0         |      | 1     | 1.0         | True      |
| 51 | win.medium | 4096      | 80   | 0         |      | 2     | 1.0         | True      |
| 52 | win.large  | 8192      | 120  | 0         |      | 4     | 1.0         | True      |
+----+------------+-----------+------+-----------+------+-------+-------------+-----------+

aiadm $ nova image-list
+--------------------------------------+-------------------------------------------+--------+--------+
| ID                                   | Name                                      | Status | Server |
+--------------------------------------+-------------------------------------------+--------+--------+
| 76aacd30-5a23-4df4-91d4-b4a96a9b7638 | SLC5 CERN Server - i386 [130920]          | ACTIVE |        |
| e3496dfa-11a7-496c-a634-107d3d10b22a | SLC5 CERN Server - i386 [2014-01-30]      | ACTIVE |        |
| 4e1c1875-3b9f-48fc-b43b-7233f450800b | SLC5 CERN Server - i386 [2014-08-05]      | ACTIVE |        |
| 8c234ca2-ec89-4e7a-9733-9a228c401571 | SLC5 CERN Server - x86_64 [130920]        | ACTIVE |        |
| 8ba9f996-4399-4dbb-93ee-98821d74f7a1 | SLC5 CERN Server - x86_64 [2014-01-30]    | ACTIVE |        |
| 63cc5d34-b892-4801-81bf-56c66ff38000 | SLC5 CERN Server - x86_64 [2014-08-05]    | ACTIVE |        |
| cd233204-96d2-41a4-ab2f-09d3b1954404 | SLC5 Server - i386 [130624]               | ACTIVE |        |
| a27962b7-e44e-4363-970b-fd4f8ec1eec5 | SLC5 Server - i386 [130920]               | ACTIVE |        |
| d1285114-9c39-467f-8d6b-487b10fbaf90 | SLC5 Server - i386 [2014-01-30]           | ACTIVE |        |
| e32bed58-b2b2-4a6d-b9ba-7e9db2e3e5a6 | SLC5 Server - i386 [2014-08-05]           | ACTIVE |        |
| 690be388-2e8e-4498-9c1f-7c4eac862260 | SLC5 Server - x86_64 [130624]             | ACTIVE |        |
| 41992b34-19e9-4ea9-ad30-177233795732 | SLC5 Server - x86_64 [130920]             | ACTIVE |        |
| 0d2c81c6-488d-42e6-8d30-8bcc5cdffa58 | SLC5 Server - x86_64 [2014-01-30]         | ACTIVE |        |
| ccb6749f-f740-4432-85d9-65e7857ed7c7 | SLC5 Server - x86_64 [2014-08-05]         | ACTIVE |        |
| 764434ef-47a9-4345-befb-2b0479a346c5 | SLC6 CERN Server - i386 [130920]          | ACTIVE |        |
| 4d9a71b8-92e4-446e-9939-21f3a7e99211 | SLC6 CERN Server - i686 [2014-01-30]      | ACTIVE |        |
| 5b957b5b-b220-426b-b217-eb50d9f472ad | SLC6 CERN Server - i686 [2014-08-05]      | ACTIVE |        |
| 2171bb6e-6404-44e9-8cbd-8c6f6bacce1c | SLC6 CERN Server - x86_64 [130920]        | ACTIVE |        |
| 98686db8-834d-4cf5-bfe3-4bc09513682a | SLC6 CERN Server - x86_64 [2014-01-30]    | ACTIVE |        |
| 13ec4721-3f9b-4480-a29e-ccfd897120d7 | SLC6 CERN Server - x86_64 [2014-08-05]    | ACTIVE |        |
| 49e166bb-68e1-4969-b26a-64023e87ef28 | SLC6 Server - i386 [130624]               | ACTIVE |        |
| eac5a399-d1c5-43a4-928f-3bbbba7f7cf7 | SLC6 Server - i386 [130920]               | ACTIVE |        |
| ab2fd0fa-ae7b-4a29-a9fa-57c5c5baf6da | SLC6 Server - i686 [2014-01-30]           | ACTIVE |        |
| e05c34f7-afcc-4c69-985e-6d9c75011723 | SLC6 Server - i686 [2014-08-05]           | ACTIVE |        |
| b8018173-fdfc-442c-9337-612fc702652a | SLC6 Server - x86_64 [130624]             | ACTIVE |        |
| 78deafa9-93a7-41d9-9afb-8c62e29e4259 | SLC6 Server - x86_64 [130920]             | ACTIVE |        |
| 321b8583-967f-4f56-913e-2a10e058ff37 | SLC6 Server - x86_64 [2014-01-30]         | ACTIVE |        |
| d1cb4dce-7a03-4342-a6c1-9677ecb8770d | SLC6 Server - x86_64 [2014-08-05]         | ACTIVE |        |
| 5514d635-22f8-4cc8-8550-4d831920a6d4 | Ubuntu 13.10 cloud image                  | ACTIVE |        |
| 4717a8fa-6980-4b33-b27d-1526db467749 | Windows 7 - x64 [130924]                  | ACTIVE |        |
| b51918ba-8bf7-421e-a1a6-cee78928cbc9 | Windows 7 - x64 [131213]                  | ACTIVE |        |
| e9d9fe68-a977-470a-bb25-e18f7e7222ca | Windows 7 - x64 [2014-06-23]              | ACTIVE |        |
| dac59475-e195-4e5a-b962-f599d45c893f | Windows 8.1 - x64 [2014-04-17] (Pilot)    | ACTIVE |        |
| 091a87b6-5882-42cf-9de3-d049281b51e8 | Windows Server 2008 R2 - x64 [130904]     | ACTIVE |        |
| 6be8397d-264f-4804-a7a9-e83488f6ee9a | Windows Server 2008 R2 - x64 [140116]     | ACTIVE |        |
| 569370f9-c915-4a74-822b-46be7e3330c3 | Windows Server 2008 R2 - x64 [2014-04-17] | ACTIVE |        |
| ea4179a9-cc5f-40ce-b700-92e1fee13a44 | Windows Server 2012 R2 - x64 [2014-01-29] | ACTIVE |        |
| a5758c5d-8487-4835-a47b-535cd5a0d815 | Windows Server 2012 R2 - x64 [2014-04-17] | ACTIVE |        |
+--------------------------------------+-------------------------------------------+--------+--------+

aiadm $ ai-bs-vm -i "SLC6 Server - x86_64 [2014-08-05]" --nova-flavor=hep2.8 --foreman-environment=inspire_devel -g "inspire/wn" --landb-responsible=inspire-admin --landb-mainuser=inspire-admin inspirevm123

Note that hep2.8 is currently the largest available flavor, and that the machine name is the last parameter. Do not use the SLC6 CERN Server images: they conflict with Puppet.
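To avoid picking a CERN Server image or an outdated snapshot by mistake, the image name can be extracted from the nova image-list output. The following is only a sketch: latest_image is a hypothetical helper (not part of the AI tools), and it assumes the table layout shown above and only considers images whose name ends in a [YYYY-MM-DD] date.

```shell
#!/bin/sh
# Hypothetical helper: read `nova image-list` table output on stdin and print
# the name of the most recent image called "<prefix> [YYYY-MM-DD]".
#   $1: exact image name prefix, e.g. "SLC6 Server - x86_64"
# Images with the older [YYMMDD] date style are ignored (assumption).
latest_image() {
    awk -F'|' -v prefix="$1" '
        {
            name = $3                      # third "|"-separated field is the Name column
            gsub(/^ +| +$/, "", name)      # trim the column padding
            if (index(name, prefix " [") == 1 &&
                match(name, /\[[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]\]$/)) {
                date = substr(name, RSTART + 1, 10)   # the date between the brackets
                if (date > best) { best = date; best_name = name }
            }
        }
        END { if (best_name != "") print best_name }
    '
}

# Example (on aiadm, after the eval $(ai-rc ...) step):
#   nova image-list | latest_image "SLC6 Server - x86_64"
```

Because the prefix must match from the start of the name, "SLC6 CERN Server - x86_64" images are never selected when asking for "SLC6 Server - x86_64". The printed name can then be passed to the -i option of ai-bs-vm.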

How to properly upgrade a machine

Typically, machines are updated automatically thanks to distro-sync: http://information-technology.web.cern.ch/book/cern-configuration-management-system-user-guide/faq/upgrade-hostgroup-latest-existing-os

Sometimes, however, the system fails to update packages because yum is not able to upgrade the current kernel properly: https://cern.service-now.com/service-portal/article.do?n=KB0001959&s=yum%20kernel

You can recognize this by the following message:

* ********************************************************************
* Welcome to p05153026581150.cern.ch, SLC, 6.5
* Archive of news is available in /etc/motd-archive
* Reminder: You have agreed to comply with the CERN computing rules
* http://cern.ch/ComputingRules
* Puppet environment: production
* Puppet hostgroup: inspire/wn
* Node alarmed with LAS? true
* Please set a host or hostgroup parameter 'comment' to describe your host or hostgroup.
* * WARNING, p05153026581150.cern.ch has lemon exceptions:
*  exception.Operating_System
*  exception.YUM_error

* ********************************************************************

The easiest fix is to remove all the old kernels, e.g.:

[p05153026485494] /afs/cern.ch/user/s/skaplun > rpm -qa | grep kernel
kernel-2.6.32-431.el6.x86_64
kernel-module-openafs-2.6.32-431.el6-1.6.5-cern1.2.slc6.x86_64
kernel-debug-2.6.32-358.23.2.el6.x86_64
yum-kernel-module-1-5.slc6.cern.noarch
kernel-2.6.32-431.1.2.el6.x86_64
kernel-module-openafs-2.6.32-431.1.2.el6-1.6.5-cern1.2.slc6.x86_64
kernel-debug-2.6.32-431.1.2.el6.x86_64
libreport-plugin-kerneloops-2.0.9-19.el6.x86_64
kernel-firmware-2.6.32-431.1.2.el6.noarch
abrt-addon-kerneloops-2.0.8-21.slc6.x86_64
dracut-kernel-004-336.el6_5.2.noarch
kernel-module-openafs-2.6.32-358.18.1.el6-1.6.5-cern1.2.slc6.x86_64
kernel-module-openafs-2.6.32-358.23.2.el6-1.6.5-cern1.2.slc6.x86_64
kernel-headers-2.6.32-431.1.2.el6.x86_64
kernel-2.6.32-358.18.1.el6.x86_64
[p05153026485494] /afs/cern.ch/user/s/skaplun > rpm -e kernel-2.6.32-358.18.1.el6.x86_64 kernel-module-openafs-2.6.32-358.23.2.el6-1.6.5-cern1.2.slc6.x86_64 kernel-module-openafs-2.6.32-358.18.1.el6-1.6.5-cern1.2.slc6.x86_64 kernel-debug-2.6.32-358.23.2.el6.x86_64

You can then proceed with sudo yum update.
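Picking the old packages by eye from the rpm -qa listing is error-prone, so the selection can be sketched as a small filter. This is only an illustration: list_old_kernel_packages is a hypothetical helper, it assumes the package naming shown in the listing above, and it only looks at the kernel, kernel-debug and kernel-module-openafs packages.

```shell
#!/bin/sh
# Hypothetical helper: read package names (as printed by `rpm -qa`) on stdin
# and print the kernel, kernel-debug and kernel-module-openafs packages that
# do NOT belong to any of the kernel versions given as arguments.
#   $@: kernel versions to keep, without architecture, e.g. 2.6.32-431.1.2.el6
list_old_kernel_packages() {
    if [ "$#" -eq 0 ]; then
        # Refuse to run without a keep list: every kernel would be selected.
        echo "usage: list_old_kernel_packages VERSION_TO_KEEP..." >&2
        return 1
    fi
    grep -E '^kernel(-debug|-module-openafs)?-[0-9]' |
    while read -r pkg; do
        keep=no
        for version in "$@"; do
            case "$pkg" in
                # the version is followed by ".arch" or by "-<openafs version>"
                *"-$version"[.-]*) keep=yes ;;
            esac
        done
        if [ "$keep" = no ]; then
            printf '%s\n' "$pkg"
        fi
    done
    return 0
}

# Example, keeping the two newest kernels from the listing above:
#   rpm -qa | list_old_kernel_packages 2.6.32-431.el6 2.6.32-431.1.2.el6
```

With the listing above and those two versions kept, the filter prints the same four packages that are passed to rpm -e in the example. Always review the printed list, and make sure the running kernel (see uname -r) is among the kept versions, before removing anything.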

Resize disk of newly created machines

Just follow http://information-technology.web.cern.ch/book/cern-cloud-infrastructure-user-guide/administering-vms/resizing-disks

How to properly reboot a machine

If a machine needs to be rebooted, follow these steps:

  • disable alarms via roger:
[aiadm045] /afs/cern.ch/user/s/skaplun > roger update p05153026485494 --all_alarms false
  • log into the machine as root (e.g. from LXPlus) so that your user is not blocking AFS
  • in case the machine is running bibsched, run bibsched halt to halt bibsched correctly; the running tasks should then end.
  • in case the machine is a WN, properly disable it in haproxy by running fab disable:inspire05 from your inspire-script folder.
  • in case the machine is running solr: no special procedure is available (note: inspire05 has /etc/rc.d/init.d/solr, which takes care of shutdown and restart)
  • in case the machine is running redis: the procedure still needs to be verified
  • in case the machine is running the MySQL master: amend all WNs to switch to read-only mode and to use the slave as the master node. Disable slave replication on the slave. Disable bibsched.
  • in case the machine is running the MySQL slave: disable slave replication and amend all WNs so that they do not point to the slave.
  • shutdown -r +5
  • check via foreman console for proper reboot
  • restart affected services, in particular solr, using one of the following methods:

    preferred method: via the init script
       $ sudo /etc/rc.d/init.d/solr   start|stop
    or
       $ sudo /sbin/service solr  start|stop

    alternative method: directly via the startup script in the install directory
       $ screen -DR
       screen $ cd /opt/cds-invenio/lib/apache-solr-3.1.0/example
       screen $ sudo java -jar start.jar &
       screen $ exit

    Note: inspire05 has /etc/rc.d/init.d/solr, which takes care of shutdown and restart:
       $ chkconfig --list solr
       solr            0:off   1:off   2:off   3:on    4:off   5:on    6:off
  • reattach WNs to haproxy via: fab enable:inspire05
  • re-enable alarms: aiadm $ roger update p05153026485494 --all_alarms true
  • restart bibsched when necessary.
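
Since the order of the steps above matters, it can help to print the commands for a given machine before starting. The following is a minimal sketch: print_reboot_plan is a hypothetical helper that only echoes the commands quoted in the list and never executes them. It does not cover the solr, redis and MySQL cases or the service restarts, which still need to be handled by hand.

```shell
#!/bin/sh
# Hypothetical helper: print (never execute) the ordered reboot commands.
#   $1: machine name as used with roger, e.g. p05153026485494
#   $2: name used by the fab tasks, e.g. inspire05
#   remaining arguments: roles of the machine, any of "wn" and "bibsched"
# Both $1 and $2 are required.
print_reboot_plan() {
    roger_host=$1
    fab_name=$2
    shift 2
    is_wn=no
    runs_bibsched=no
    for role in "$@"; do
        case "$role" in
            wn) is_wn=yes ;;
            bibsched) runs_bibsched=yes ;;
        esac
    done

    echo "roger update $roger_host --all_alarms false"   # on aiadm
    if [ "$runs_bibsched" = yes ]; then
        echo "bibsched halt"                             # on the machine
    fi
    if [ "$is_wn" = yes ]; then
        echo "fab disable:$fab_name"                     # from the inspire-script folder
    fi
    echo "shutdown -r +5"                                # on the machine, as root
    if [ "$is_wn" = yes ]; then
        echo "fab enable:$fab_name"                      # once the machine is back
    fi
    echo "roger update $roger_host --all_alarms true"    # on aiadm
    return 0
}

# Example: a worker node that also runs bibsched
print_reboot_plan p05153026485494 inspire05 wn bibsched
```

Keeping the helper print-only is deliberate: each command runs on a different host (aiadm, the machine itself, or your own checkout of the scripts), so the output is meant as a checklist rather than a script to pipe into a shell.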
-- KaplunSamuele - 08 Apr 2014
Topic revision: r9 - 2015-11-16 - KaplunSamuele
 