


Second Tape Library for CERN

Charles Curran, IT/PDP


Disaster Avoidance and Recovery for CERN Data

Almost all of CERN's active data is now stored in one large STK automated library in the Computer Centre (building 513). Concentrated into this small area are the results of many years of costly work, so the consequences of an accident or malicious act involving this library could be very severe. Proposals for disaster avoidance and disaster recovery for this data are now being implemented.

Data can be damaged or lost through malicious action or through accident. Basic measures have been in place for some time to reduce the most obvious of these risks. The Tape Management System (TMS) controls write access to tapes, and physical access to the Computer Room and the Tape Vault is also controlled. Every tape has a write-protect 'switch', which can be set on user request, although this requires a manual intervention. Tapes containing migrated 3480, 3490 and 3590 data, for example, are given this extra protection against accidental overwriting. Accidental damage can nevertheless occur on a large scale as a result of 'normal' accidents, the obvious examples being fire, smoke particles and flooding.

An effective way to reduce exposure to such accidental data loss is to divide important data between two or more physically separate libraries. Most accidents or extended periods of 'down time' can be expected to affect just one of the libraries: users can continue to work with the other, with 50% or more of their data accessible, depending on how they have distributed it between the libraries. This distribution can be done reasonably effectively in a 'passive' manner, for example by placing every 'even numbered' volume in one library and every 'odd numbered' volume in the other. On average, half of any given set of data will then be in each library (supposing it occupies several physical volumes). Alternatively, data might be split between the two libraries on run criteria, if a user wished to do this.

Users or experiments can do even better by explicitly making copies of vital data, locating the active copies in one library and the 'backup copies' in the other. At the expense of using more cartridges, all such vital data would be present in both libraries. Products such as HPSS also allow multiple copies of data, and these will be split between the physical libraries.
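As a rough illustration of the two approaches described above, the sketch below compares what remains readable when one library is unavailable. The function names, the library labels 'A' and 'B' and the sample volume numbers are assumptions made for this example; they are not part of TMS or of any CERN tool.

```python
def parity_library(volume_number: int) -> str:
    """Passive split: even-numbered volumes go to 'A', odd-numbered to 'B'.

    The labels 'A' and 'B' are hypothetical names for the two libraries.
    """
    return "A" if volume_number % 2 == 0 else "B"


def accessible_fraction(volumes, lost_library, duplicated=False):
    """Fraction of a data set still readable when `lost_library` is down.

    With duplication, every volume has a copy in each library, so the
    loss of either library leaves the whole data set readable.
    """
    if not volumes:
        raise ValueError("data set has no volumes")
    if duplicated:
        return 1.0
    surviving = [v for v in volumes if parity_library(v) != lost_library]
    return len(surviving) / len(volumes)


def cartridges_needed(volumes, duplicated=False):
    """Duplication doubles the cartridge count; the passive split adds none."""
    return len(volumes) * (2 if duplicated else 1)


if __name__ == "__main__":
    dataset = list(range(1000, 1010))  # ten consecutive, made-up volume numbers
    print(accessible_fraction(dataset, "A"))                   # passive split
    print(accessible_fraction(dataset, "A", duplicated=True))  # explicit copies
    print(cartridges_needed(dataset, duplicated=True))
```

For a data set spread over consecutively numbered volumes, the passive split leaves about half of it readable at no extra cost, whereas explicit copies keep all of it readable at twice the cartridge count.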

IT will begin to provide such a dual library facility in December 1999. Building 186, which until now offered a physically separate archive, has been cleared for a new EP facility. Plans are, however, being made for an annex to building 186, which will house this physically separate robot installation. It is hoped that this work will be completed during 2000. In the short term, installation of the second library will begin in the Computer Centre Tape Vault. Once the preliminary installation has been tested, we will start to move user cartridges to the new library.

Default moves of cartridges

Users or experiments who would prefer a move tailored to their specific needs or requirements are requested to contact Tape.Support@cern.ch. By default, we propose to:

  1. Split 9840 volumes equally between the old and new libraries, 'even' VIDs such as 'R01002' to one and 'odd' VIDs such as 'R01003' to the other.
  2. Split 10 GB Redwood volumes equally between the old and new libraries in a similar way.
  3. Split 25 GB Redwood volumes equally between the old and new libraries in a similar way.
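The default rule in the list above can be sketched as a small function that reads the parity of the digits at the end of a VID. This is only an illustration: the VID pattern is inferred from the two examples given ('R01002', 'R01003'), and the choice of which parity goes to which library is an assumption, since the text only says that 'even' VIDs go to one library and 'odd' VIDs to the other.

```python
import re
from collections import defaultdict

# Digits at the end of a VID such as 'R01002' (format assumed from the examples).
_TRAILING_DIGITS = re.compile(r"(\d+)$")


def target_library(vid: str) -> str:
    """Return 'old' for even VIDs and 'new' for odd VIDs.

    The even -> 'old', odd -> 'new' mapping is a hypothetical choice.
    """
    match = _TRAILING_DIGITS.search(vid)
    if match is None:
        raise ValueError(f"VID {vid!r} does not end in digits")
    return "old" if int(match.group(1)) % 2 == 0 else "new"


def plan_moves(vids):
    """Group VIDs by the library they would be assigned to."""
    plan = defaultdict(list)
    for vid in vids:
        plan[target_library(vid)].append(vid)
    return dict(plan)


if __name__ == "__main__":
    print(plan_moves(["R01002", "R01003", "R01004", "R01005"]))
```

Only the cartridges assigned to the new library need to be ejected and re-inserted; the others stay where they are.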

Currently this implies the ejection and re-insertion in the new library of about 300 9840s, 1200 10 GB Redwoods and 3300 25 GB Redwoods. This operation will be done over several days in small batches of cartridges, and should be transparent to the users. The more numerous 50 GB Redwoods (about 11000 in total) will initially remain in the existing library, while we discuss their requirements with the experiments that own them.

The configuration proposed for the end of 1999 and early 2000 is shown below. Note that the 'Vault' configuration shows how the two Powderhorns will eventually be expanded to four, and possibly to six, in the 'remote' location. The 'Centre' configuration shows how it could also be expanded to six Powderhorns. The Web pages at:

http://wwwinfo.cern.ch/pdp/vm/guide/avoidance.html

show all the moves of equipment that are currently planned to reach the final configurations of the two libraries, each of four Powderhorn silos (optionally six).


[Figure: Arrangement of the StorageTek silo systems, December 1999/January 2000. Panels: Bldg. 513 - Centre; Bldg. 513 - Vault.]

Users should note that our strong dependence on a single media type is also being reduced. The STK 9840 service is now used extensively for storing data other than 'bulk raw data'. It offers users a choice of media and, although the media cost more per GByte, it provides much faster access to files on tape than the other media we currently offer. The low cost of the 9840 units (compared with Redwood) makes them affordable for installation by CERN's collaborating laboratories and institutes. You will also see above that we intend to install IBM units in the silos. These will initially be our existing 3590 units, which can be attached to STK silos using IBM's 'C12' drive frame. Upgrades of these units to the 3590E model are under active consideration. The upgrade doubles the track density on the 'J' media, enabling a cartridge (if rewritten) to store a nominal 20 GB of data rather than 10 GB. The data rate also increases from ~9 MBytes/s to ~14 MBytes/s for both read and write.
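To put the 3590E figures in perspective, the short calculation below estimates how long it takes to stream a full cartridge at the nominal rates quoted above. It is a back-of-the-envelope sketch only: it assumes 1 GB = 1000 MBytes, a sustained nominal data rate and no compression, positioning or mount time. The function name is invented for the example.

```python
def full_cartridge_minutes(capacity_gb: float, rate_mbytes_per_s: float) -> float:
    """Minutes needed to read or write a whole cartridge at a sustained rate.

    Assumes 1 GB = 1000 MBytes and ignores mount and positioning time.
    """
    if rate_mbytes_per_s <= 0:
        raise ValueError("data rate must be positive")
    seconds = capacity_gb * 1000 / rate_mbytes_per_s
    return seconds / 60


if __name__ == "__main__":
    # Nominal figures taken from the text: 3590 and 3590E on 'J' media.
    for model, capacity_gb, rate in (("3590", 10, 9), ("3590E", 20, 14)):
        minutes = full_cartridge_minutes(capacity_gb, rate)
        print(f"{model}: {capacity_gb} GB at {rate} MBytes/s -> {minutes:.1f} min")
```

On these assumptions a full 3590 cartridge streams in roughly 18.5 minutes and a full 3590E cartridge in roughly 23.8 minutes: the capacity doubles while the data rate rises by a factor of about 1.6, so a full cartridge takes somewhat longer to read or write.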

We will choose a suitable moment to move or reconfigure the four HPSS Redwood units, which should also be split between the two libraries. However, it is possible that these Redwoods will in fact be released during 2000 and replaced by 9840s or 3590Es.



For matters related to this article please contact the author.

Cnl.Editor@cern.ch



Last Updated on Thu Dec 16 13:32:55 GMT+03:30 1999.
Copyright © CERN 1999 -- European Laboratory for Particle Physics