ALICE Data Challenge 2000

Charles Curran, IT/PDP

Abstract: ALICE, in collaboration with IT, will attempt to simulate their data processing chain from the DAQ equipment to the central tape libraries, running at ~100 MBytes/s for a period of a week. Both CERN's CASTOR software and HPSS will be tested. This is an important step towards their final aim of reliable data recording at ~1000 MBytes/s, which they will need during LHC heavy ion runs.

Some time ago the ALICE collaboration decided to start a project
in collaboration with IT to try to use a commercial HSM as a means
of storing experimental data at rates approaching those that they
will require when LHC starts up. While most LHC experiments demand
only a 'modest' data rate of ~100 MBytes/s, ALICE will need to
sustain a rate of ~1000 MBytes/s during heavy ion runs. At that
time, the 'commercial' candidate was the High Performance Storage
System, HPSS, which was (and still is) developed by an IBM-led
consortium. You can see more about HPSS at URL
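Data sent to an HSM such as HPSS is first written to HPSS-owned disk and subsequently migrated from disk to tape, so the stage disk must absorb any gap between the ingest rate and the aggregate tape rate. The following sketch (illustrative Python, not HPSS code, with hypothetical numbers) shows why an undersized stage area limits the sustainable transfer rate:

```python
# Illustrative sketch of a disk-buffered HSM: data arrives at `ingest_mb_s`
# and drains to tape at `drain_mb_s`. If ingest exceeds the tape rate, the
# stage disk must hold the growing backlog for the whole run.
def required_buffer_gb(ingest_mb_s, drain_mb_s, duration_s):
    """Disk space (GB) needed so the stage area never overflows."""
    backlog_mb_s = max(0.0, ingest_mb_s - drain_mb_s)
    return backlog_mb_s * duration_s / 1000.0

# Hypothetical figures: 100 MB/s in, 80 MB/s aggregate to tape, for one day.
print(required_buffer_gb(100, 80, 24 * 3600))  # 1728.0 GB of stage disk
```

With ingest and tape rates balanced, a small buffer suffices; any sustained shortfall on the tape side makes the required disk grow linearly with run length, which is consistent with the disk shortage seen in the first attempt.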
The initial aim of the project was to sustain ~100 MBytes/s for a period of at least a week, using the entire chain from ALICE's DAQ system down to the central robotic tape library. A first attempt was made last year to store data into HPSS at up to 40 MBytes/s. This soon revealed that we had insufficient disk to sustain such a transfer rate (normally, data sent to HPSS is first written to HPSS-owned disk, and subsequently migrated from disk to tape). It also revealed unexpected hardware problems (since resolved) in the IBM hosts when attempting to sustain a high data rate to tape for several days. Difficulties were also seen with the COMPAQ Alpha 4100 SMP hosts, where performance was lower than expected; these problems have also been corrected. Nevertheless, a fairly respectable 25 MBytes/s was sustained when the tape writing step was suppressed. This year's tests are expected to improve greatly on this rate.

Since the start of this project, a development of the existing
CERN stager software has begun: CASTOR. You can see more about
CASTOR at URL:

The tests started on March 23rd using HPSS, and this first phase ended on Thursday 6th April. The HPSS test took some time to arrange, as tape and disk 'movers' needed to be configured into a DCE cell, and the cartridges to be used needed to be imported into HPSS. Unfortunately, all 16 borrowed Redwood tape drives had to come from the 24 normally available units in the Computer Room silo library, which inevitably impacted the normal user service, as the only possible hosts were COMPAQ 4100 systems installed next to this library. It was hoped to reach ~100 MBytes/s for a one-week period. Although the total data transferred into HPSS was much less than was hoped for, due to unexpected software, hardware and network problems, the second phase using CASTOR should benefit from the workarounds and solutions found so far.

With CASTOR, the Data Challenge again hopes to reach ~100 MBytes/s for a one-week period. If this attempt goes well, we may try for a short period to go above 100 MBytes/s. The CASTOR test (just like the HPSS test) will use equipment normally in 'public' use: in this case 11 Redwood tape units and several hundred Redwood cartridges, which will be rewritten as required. These Redwood units can come from both the Computer Room and the Tape Vault silo libraries, so the normal user service will not be affected as much as in the case of the HPSS test. We will again be using borrowed PCs, disks, and ~1000 as yet unused Redwood cartridges. We are most grateful to those who have lent these resources, without which these tests would not be possible.

All the tests will end before the start of CERN's accelerators for 2000. Despite our efforts to reduce the impact of these tests on 'normal use' as far as possible, users will see longer waits for data access due to the temporary reduction in unit numbers available to them.
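For scale, sustaining the target rate for the planned period implies roughly the following volumes (a back-of-envelope sketch in Python; the ~50 GB per Redwood cartridge is an assumption for illustration, not a figure from this article):

```python
# Back-of-envelope volume estimate for the Data Challenge target:
# ~100 MBytes/s sustained for one week, written to Redwood cartridges.
RATE_MB_S = 100                          # target sustained rate
WEEK_S = 7 * 24 * 3600                   # one week in seconds (604800)

total_gb = RATE_MB_S * WEEK_S / 1000.0   # total volume in GB
cartridges = total_gb / 50.0             # assuming ~50 GB per cartridge

print(total_gb)     # 60480.0 GB, i.e. ~60 TB in a week
print(cartridges)   # ~1210 cartridges
```

Under that capacity assumption, a full week at the target rate consumes on the order of 60 TB, which is of the same order as the ~1000 unused cartridges set aside for the tests.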
We hope that you will not experience too much inconvenience during these tests, but we feel that they are really necessary: such sustained data rates have already shown unsuspected problems, and more may be expected to show themselves in the second half of the test. The knowledge gained will help us to plan the systems that will be required when LHC starts on as realistic a foundation as possible. Another article at a later date will describe the results of this Data Challenge.

If all goes well, further tests will be carried out. It is hoped that the data rate achieved might be doubled every year, thus rapidly approaching the final target of ~1000 MBytes/s. This will definitely require more equipment than we have available now, or different equipment!

About the author(s): Charles works in the IT/PDP group. He is responsible for the group's efforts to automate and keep up-to-date CERN's central tape systems, and to get them ready to support LHC's challenging data rates.