The replacement for the public CERNVM Batch services is provided on the CERNSP service using the IBM Loadleveler product. There is a chapter on this in the CERNSP Introductory User Guide available from the UCO or at http://consult.cern.ch/writeups/cernspintro. We will be making three significant changes to this environment shortly after the appearance of this CNL.
Change 1 --
To submit a file as a batch job, use the command llsubmit filename. Loadleveler assigns the resulting job a unique job identifier of the form sp008.1234.0. TODAY this identifier is displayed under the Id column by the llq command, which shows the current job queue, and it can be used as an argument to various Loadleveler commands, such as llcancel to cancel a job.
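For scripts that need to handle such identifiers, the following Python sketch splits one into its dot-separated parts. The field names host, job and step are assumptions made for illustration; the text above only gives the overall form sp008.1234.0.

```python
from typing import NamedTuple


class JobId(NamedTuple):
    """Parts of an identifier such as sp008.1234.0 (field names are assumed)."""
    host: str
    job: int
    step: int


def parse_job_id(text: str) -> JobId:
    # Split from the right so that a host name containing dots stays intact.
    host, job, step = text.strip().rsplit(".", 2)
    return JobId(host, int(job), int(step))


print(parse_job_id("sp008.1234.0"))
```

A malformed identifier (fewer than three parts, or non-numeric trailing parts) raises ValueError, which a calling script can catch.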
Jobs are also assigned jobnames, either through a Loadleveler statement in the submitted script (the job_name parameter), or defaulted by the llsubmit command. llsubmit uses the string it finds in the $HOME/.lljobcount file to build a job name. It looks for a 3-digit integer at the end of this string, adds one, and saves the result back. If there is no such integer it will create 001. If the file does not exist it will create it to contain uuu001, where uuu is the first part of the unix account field before the $ sign (you can see yours via 'grep your-loginid /etc/account'). You can edit the .lljobcount file at any time.
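The counter update described above can be sketched in Python as follows. This is only an illustration of the rule as stated, not the llsubmit implementation. The function names are hypothetical, and two details are assumptions: that a missing counter is appended to the existing string, and that 999 simply rolls on to 1000. The text also does not say whether the job name is built from the value before or after the increment, so only the update step is modelled.

```python
import re

# Exactly three digits at the very end of the string.
_TRAILING_COUNTER = re.compile(r"(\d{3})$")


def bump_counter(text: str) -> str:
    """Return the string that would be saved back to .lljobcount."""
    text = text.strip()
    match = _TRAILING_COUNTER.search(text)
    if match is None:
        # Assumption: with no trailing 3-digit integer, 001 is appended.
        return text + "001"
    value = int(match.group(1)) + 1
    # Assumption: no wrap-around; 999 becomes 1000.
    return text[:match.start()] + f"{value:03d}"


def initial_counter(account_field: str) -> str:
    """Default file content: the part of the account field before '$', plus 001."""
    return account_field.split("$", 1)[0] + "001"


print(bump_counter("abc001"))
print(bump_counter("myjobs"))
print(initial_counter("abc$xy"))  # "abc$xy" is a made-up account field
```

Since the file can be edited at any time, a user who writes run041 into it should, under these assumptions, find run042 there after the next submission.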
The first change users will see is that llq will by default display the jobname instead of the jobid. Loadleveler commands which take a jobid can also take a jobname as argument, and the jobid will still be available through a special option (-r jobid) to the llq command.
Change 2 --
Loadleveler currently creates a new directory in the directory of the submitted file, plus the two job files known as standard error and standard out. Hence the command:
llsubmit mydir/myjob
resulting in job sp008.1234.0
will make
Inside the "return" directory Loadleveler puts, at job end time, all the files that the batch job leaves in its $WORK scratch directory (there are parameters to control this). We are going to change our system so that the .err and .out files are written into the local $WORK directory and then, at job end time, copied back to the "return" directory.
This way,
Change 3 --
Currently users can log in to batch nodes to inspect the progress of batch jobs, though this is only tolerated if the jobs are misbehaving. We have now introduced a set of commands which allow remote inspection of running batch jobs, so we will be disallowing login to the batch and parallel nodes. The commands all begin with ll; you can see a summary of them by typing man batch (from an SP2 node), and details on individual commands through man commandname as usual.