For various reasons, CERN has decided to no longer use HPSS for
storing files on tapes, and to switch to the home-grown product
Castor (Castor fiber L.). Experiment data already stored
in HPSS is being actively moved to Castor by the
experiments themselves.
The other type of data stored in HPSS so far is the so-called
"User Tapes" data, offered as a way to archive/restore
large files (like n-tuples) on tertiary storage. This is achieved
through a command called "hsm".
The time has come to convert this tool to use Castor rather
than HPSS, and to help users move their data.
The obvious idea would be to change the hsm command
to be fully transparent during the conversion period:
- The "hsm put" command would start putting files into
Castor instead of HPSS.
- The "hsm get" command would try to locate files first
in Castor, then in HPSS, so that the Castor copy, if it exists,
is the one retrieved.
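The lookup order described for "hsm get" can be sketched as follows. This is only an illustration, not the actual hsm code: the function name locate and the use of plain Python sets as stand-ins for the Castor and HPSS name spaces are assumptions made for the example.

```python
def locate(path, castor_files, hpss_files):
    """Return which store would serve `hsm get path` in the transparent scheme.

    castor_files and hpss_files are hypothetical stand-ins (sets of names)
    for the real Castor and HPSS name spaces.
    """
    if path in castor_files:
        # A Castor copy always wins, even if an HPSS copy also exists.
        return "castor"
    if path in hpss_files:
        # Fall back to HPSS only when Castor has no copy.
        return "hpss"
    # The file is known to neither system.
    return None
```

With this ordering, a file already copied to Castor is never fetched from HPSS again, which is what makes the switch invisible to the user.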
Things are already less easy with the other subcommands:
- "hsm ls" or "hsm query" should look
in both Castor and HPSS, and give some sort of composite output,
where the only HPSS files shown are those which have no counterpart
in Castor.
- "hsm delete" should look in both places, and
delete all the existing copies of a file.
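To make the extra bookkeeping concrete, here is a minimal sketch of the composite listing and the two-sided delete. The function names and the use of Python sets in place of the real Castor and HPSS name spaces are assumptions for illustration only.

```python
def composite_listing(castor_files, hpss_files):
    """Merge both name spaces: every Castor file, plus HPSS-only files."""
    hpss_only = set(hpss_files) - set(castor_files)
    listing = [(name, "castor") for name in sorted(castor_files)]
    listing += [(name, "hpss") for name in sorted(hpss_only)]
    return listing


def delete_everywhere(path, castor_files, hpss_files):
    """Remove every existing copy of path; return the stores it was removed from."""
    removed = []
    for label, store in (("castor", castor_files), ("hpss", hpss_files)):
        if path in store:
            store.discard(path)
            removed.append(label)
    return removed
```

Even in this toy form, each subcommand has to consult two systems and reconcile the results, which hints at how quickly the real implementation would grow.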
And the logic becomes over-convoluted for the "hsm rename"
command. If file A exists in the HPSS tree,
and file B in Castor, should "hsm rename A B"
fail, even if B would have been created in the HPSS
tree?
Although the design of this transparent hsm command
has been looked at, we feel that the result would not be
worth the coding and debugging effort required.
For this reason, we are going to implement a simpler,
translucent, rather than transparent approach:
- A copy tool is being written (it had to be anyway), and used
either by the users themselves, or by the Computing Center on
behalf of the users.
- While the tool is running against a specific user's files, the
hsm command is blocked for this user, with a message
asking for some patience.
- Once the copy is completed, and the files re-invigorated by a
dose of Castor oil, the hsm command only deals with
Castor files.
- This change of state will be controlled by a hidden
.hsmrc file in the user's home directory, which should
of course not be tampered with.
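The three phases described above (not yet copied, copy in progress, copy done) amount to a small per-user state switch. The sketch below shows one way such a switch could behave. The contents of the .hsmrc file are not specified here, so the single-word format, the state names and the function names are all hypothetical.

```python
from enum import Enum


class State(Enum):
    HPSS = "hpss"        # files not yet copied
    COPYING = "copying"  # copy tool currently running for this user
    CASTOR = "castor"    # copy completed


def read_state(text):
    """Parse the (hypothetical) one-word contents of a user's .hsmrc file.

    A missing or empty file is taken to mean that nothing has been copied yet.
    """
    word = (text or "").strip().lower()
    return State(word) if word else State.HPSS


def dispatch(state, subcommand):
    """Describe what an hsm subcommand would do in the given state."""
    if state is State.COPYING:
        return "blocked: your files are being copied to Castor, please be patient"
    backend = "castor" if state is State.CASTOR else "hpss"
    return f"{backend}: hsm {subcommand}"
```

At any moment each subcommand talks to exactly one storage system, so none of the two-system reconciliation needed by the transparent design arises.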
We think that this simpler way is less likely to create
problems, and will make users more aware of the change of the
underlying Hierarchical Storage System.