The hsm Command
Harry Renshall, IT/PDP
At the FOCUS meeting of July 1st 1999, Judy Richards presented an analysis of current file management and personal stage requirements (see http://www.cern.ch/CERN/meetings/FOCUS/Focus14/). She discussed Hierarchical Storage Management (HSM), in which files are moved from a disk pool to a tape pool on a 'least recently used' basis. This is currently implemented at CERN under an IBM service offering called HPSS, the High Performance Storage System, but there is concern over management and administration costs for this application. Use is therefore limited to modest amounts of data and straightforward applications such as Central Data Recording and large private user files. Access is via CERN interfaces so that user data could be migrated transparently if the underlying storage management were to be changed.
The main interfaces up to now have been the CERN staging system commands such as stagein, stageout and stagewrt, and the remote file copy command rfcp. These will both stay in use, but we have now packaged rfcp into a new command, called hsm, which hides the fact that the underlying storage manager is HPSS and will hence allow any future migration to another storage manager. The stage commands copy data between tape staging pools and the HSM filebase. The new hsm command will copy data between local workstation or PC disks and the HSM filebase. Both sets of commands work on any files stored in the HSM filebase.
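As an illustrative sketch of the difference, the same transfer might be written both ways as below. The file name, the login id "harry" and the assumption that rfcp accepts an explicit HPSS path are hypothetical examples, not a prescription:

    # Raw remote copy: the HPSS location is visible to the user
    rfcp myntuple.hbook /hpss/cern.ch/user/h/harry/myntuple.hbook

    # The new wrapper: the underlying storage manager is hidden
    hsm put myntuple.hbook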
The basic functions of the hsm command approved by FOCUS are:

- hsm put: copy a local file into the HSM filebase
- hsm get: copy a stored file back to local disk
- hsm ls: list stored files
- hsm delete: delete stored files
- hsm mkdir: create directory structures
The HSM service hence appears to users as a large back-end file store in which they have a personal home directory, currently of the form /hpss/cern.ch/user/firstletterofloginid/loginid, into which they can put local files (via hsm put), get them back (via hsm get), audit their stored files (via hsm ls), delete stored files (via hsm delete) and create directory structures (via hsm mkdir). The hsm command defaults to the home directory of the logged-in user. It is automatically available in the CERN ASIS tree and has a full man page (i.e. type "man hsm"). We have sized the resources in the back-end store assuming from 500 to 1000 users, each with from 1 to 10 GB stored.
Users have to be registered in the CERN Computer Users database by their group administrator to use the HSM. Since the underlying software uses the account registry called DCE, we have introduced a DCE service (at the same level as AFS or TAPES) into the XUSERREG command, and group administrators should register the requested UNIX loginid in this service.
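A short example session using only the subcommands named above might look as follows. This is a sketch: the directory and file names are hypothetical, and the exact argument syntax should be checked against the man page:

    hsm mkdir mydata           # create a directory in the back-end store
    hsm put bigfile.dat        # copy a local file into the HSM home directory
    hsm ls                     # list the stored files
    hsm get bigfile.dat        # copy the stored file back to local disk
    hsm delete bigfile.dat     # delete the stored copy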
The HSM service is intended for users with disk space requirements beyond those of AFS home directory or user project space, though performance reasons can also be important. Our current policy is to allow user AFS home directories to grow up to 300 MB of quota of what are typically small files (a few KB to several MB), and this is well backed-up, expensive disk space. There is an archive facility, called pubarch, which matches this type of data.
We have recently introduced an extension of AFS project space to individual users, using cheaper but backed-up disks for larger files (tens of MB) with a quota of hundreds of MB to a few GB. This space is suitable for interactive work with files up to a few tens of MB, as it uses the AFS caching mechanism, or for batch work on larger files where response is not critical. It is not suitable for high data rate work with larger files, from tens of MB up to 2 GB. For this type of data we recommend using the hsm command to copy from the HSM to the local workstation disk and working with the local copy, which can be moved back to the HSM if modified. Note that to the hsm command AFS directories appear to be local disks. However, given the architecture of AFS it does not make sense to use the HSM filebase as an active back-end for AFS project space, and users should rather try to obtain more project space. It is, however, a reasonable use of the HSM to use it as an archive from AFS project space for data that is unlikely to be reused, since the permanent store of the HSM is cheap tape rather than expensive disk.
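The recommended working pattern for large files might thus look as follows (a sketch only; the file name and the analysis program are hypothetical):

    # Fetch the stored file to fast local disk and work on the local copy
    hsm get run1999.dat
    myanalysis run1999.dat     # hypothetical application processing the file

    # If the file was modified, move the new version back to the HSM
    hsm put run1999.dat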
We will be happy to discuss with group space administrators or individual users the appropriate storage to use in case of doubt.
For matters related to this article please contact the author.
Cnl.Editor@cern.ch