Les Robertson

10 November 1999

Notes on the ST-IT coordination meeting - 2 November 1999

 

Present: Anne Funken, Les Robertson, Tim Smith, Dave Underhill, Mario Vergari

1. Background

IT Division has requested assistance from ST in the long-term planning of the infrastructure for LHC offline computing, in particular the requirements for power and cooling of the computing fabrics that will be installed in building 513 (Computer Centre). ST Division has appointed Anne Funken (ST/EL) as the project leader responsible for coordination of the medium and long-term evolution of the technical infrastructure of B.513. This meeting was the first of a series of regular meetings to establish the infrastructure requirements for LHC computing, provide initial cost estimates, and make a long-term plan that takes account of the medium-term requirements and evolution.

2. People attending the meeting

Anne Funken - ST/EL - ST coordinator for B.513
Tim Smith - IT/PDP - planning of LHC physics computing facilities
Dave Underhill - IT/CIO - group leader, central infrastructure & operations
Mario Vergari - IT/PDP - operations manager of physics data processing services
Les Robertson - IT deputy division leader

3. Scope & relationship to other activities

It was agreed that the present series of meetings will cover the following areas.

    1. LHC Planning: space, power, cooling, fire detection.
    2. Short- to medium-term planning for power and cooling.
      This must evolve as part of the longer term plan for LHC.
    3. Technical infrastructure of the planned annexe to building 186.
      This annexe is currently being planned to provide a second location for installation of magnetic tape handling robots, together with the associated computer servers. IT's contacts have been with the EP/SMI group (the annexe is funded by EP as it replaces IT space in the basement of B.186 which is being transformed into a facility for handling silicon strip detectors).
    4. Evolution of current ST support for power distribution in B.513, and support for the current UPS.

Not included in the scope of these meetings are:

    1. Smoke detection and safety in B.513. This is the subject of another series of meetings with TIS, attended by Dave and Anne.
    2. Replacement of the access control system for B.513, B.31 and B.600.


4. Discussion of issues to be examined

Dave had provided a list of discussion points (attached). The conclusions and actions are listed below.

Power

An initial estimate of power requirements for CMS in 2006 has been produced by Les (see table). It is assumed that the total requirement in 2006 is five times the CMS figure, to cover ATLAS, ALICE, LHCb, other experiments and neutrinos, giving a total sustained load of 2 MW.
The estimates assume that the base power consumption of a small system box with two disks will be similar in 2006 to that of current examples. Measurements of the steady-state and peak (start-up) power consumption of current PCs should be made (Action: Mario). Current processor power consumption is about 25-30 Watts per packaged chip. This is likely to increase slightly.

CMS farm 2006 - physical space & power

processors

        4   cpus/box (400 SI95/box)
        1   sub-farm per rack
        5   sq.m. floor area per rack
       40   sub-farms
      200   sq.m. per farm
      175   Watts per box
    1'400   boxes
      245   kW per farm

tape

       20   drives per stack
        5   stacks per farm
        2   sq.m. per stack
       10   sq.m. per farm
       75   Watts per drive
        8   kW per farm
      100   GB per cartridge
        2   PB per farm
   20'000   cartridges per farm
    6'000   cartridges per silo
        4   silos per farm
      120   sq.m. library + drives per farm
      1.5   kW per silo
        6   kW per library
       14   kW library + drives per farm

disks

       16   disks/shelf (1.6TB/shelf)
        1   disk shelf/array
        1   shelf/controller pair
        2   shelf slots per array
        9   arrays per 19" rack
   14'400   GB per rack
      340   arrays per farm
       38   racks per farm
      1.1   sq.m. per rack
       50   sq.m. per farm
      300   Watts per disk tray
      100   Watts per controller pair
      400   Watts per array
      140   kW per farm

totals

      400   kW power
      370   sq.m. floorspace
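The derived figures in the CMS farm table can be cross-checked from its input rows. The following is a minimal sketch of that arithmetic, not the method used to produce the table: the constant and function names are illustrative, the per-farm floor areas are taken as tabulated rather than recomputed, and 1 PB is taken as 1,000,000 GB.

```python
import math

# Inputs copied from the "CMS farm 2006" table.
BOXES = 1400
WATTS_PER_BOX = 175
DRIVES = 20 * 5                 # drives per stack x stacks per farm
WATTS_PER_DRIVE = 75
SILOS = 4
KW_PER_SILO = 1.5
ARRAYS = 340
ARRAYS_PER_RACK = 9
WATTS_PER_ARRAY = 300 + 100     # disk tray + controller pair
CARTRIDGES = 20_000
GB_PER_CARTRIDGE = 100
FLOOR_SQM = {"processors": 200, "tape": 120, "disks": 50}  # per-farm areas as tabulated
SCALE_TO_ALL_EXPERIMENTS = 5    # factor assumed in the notes


def cms_power_kw():
    """Return the unrounded per-category power figures in kW."""
    return {
        "processors": BOXES * WATTS_PER_BOX / 1000,
        "tape": DRIVES * WATTS_PER_DRIVE / 1000 + SILOS * KW_PER_SILO,
        "disks": ARRAYS * WATTS_PER_ARRAY / 1000,
    }


power = cms_power_kw()
cms_total_kw = sum(power.values())                      # the table rounds this to 400
all_experiments_kw = SCALE_TO_ALL_EXPERIMENTS * cms_total_kw
disk_racks = math.ceil(ARRAYS / ARRAYS_PER_RACK)
tape_capacity_pb = CARTRIDGES * GB_PER_CARTRIDGE / 1_000_000
floor_total_sqm = sum(FLOOR_SQM.values())

print(power)
print(f"CMS total: {cms_total_kw} kW, all experiments: {all_experiments_kw} kW")
print(f"disk racks: {disk_racks}, tape: {tape_capacity_pb} PB, floor: {floor_total_sqm} sq.m.")
```

The unrounded CMS sum comes to 394.5 kW, consistent with the 400 kW shown under totals, and five times that is about 1.97 MW, in line with the 2 MW sustained load quoted in the text.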

 

The space requirements are such that it will be necessary to install equipment in the tape vault and the former motor generator room in the basement of B.513.

Tim will make an estimate of the evolution for the pre-LHC period (next four years). (Action: Tim).

Tim will also investigate the possibilities for central control of the power at the rack and server level. (Action: Tim).

An uninterruptible power supply (UPS) is required to cover short power outages. For longer interruptions (> 15 minutes) an alternative power supply is required, such as diesel-powered generators. ST Division is at present reviewing the future of the Meyrin site stand-by generators and expects to reach a conclusion in the next few months.
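To give a feel for the scale implied by these figures, the sketch below estimates the energy a UPS would have to deliver to carry the 2 MW sustained load for 15 minutes, and how many 400 kVA modules (the rating mentioned in the attached discussion points) that load would correspond to. The power factor of 0.8 is an assumption made purely for illustration; the notes do not state one. Cooling load, battery efficiency and redundancy are ignored, and the function names are hypothetical.

```python
import math

SUSTAINED_LOAD_KW = 2000       # total sustained load estimated in the notes (2 MW)
RIDE_THROUGH_MINUTES = 15      # longer interruptions need an alternative supply
MODULE_KVA = 400               # rating of each current UPS module (attached notes)
ASSUMED_POWER_FACTOR = 0.8     # ASSUMPTION: not given anywhere in the notes


def ride_through_energy_kwh(load_kw, minutes):
    """Energy delivered to the load while the UPS carries it."""
    return load_kw * minutes / 60


def modules_needed(load_kw, module_kva, power_factor):
    """Smallest whole number of modules whose real power covers the load."""
    return math.ceil(load_kw / (module_kva * power_factor))


energy_kwh = ride_through_energy_kwh(SUSTAINED_LOAD_KW, RIDE_THROUGH_MINUTES)
modules = modules_needed(SUSTAINED_LOAD_KW, MODULE_KVA, ASSUMED_POWER_FACTOR)

print(f"energy for {RIDE_THROUGH_MINUTES} min at {SUSTAINED_LOAD_KW} kW: {energy_kwh} kWh")
print(f"{MODULE_KVA} kVA modules at power factor {ASSUMED_POWER_FACTOR}: {modules}")
```

Under these assumptions the load needs 500 kWh over 15 minutes and seven 400 kVA modules. Even at a power factor of 1.0 it would need five, which is more than are installed at present.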

Dave will verify the warranty conditions for the current UPS, and check the anticipated battery lifetime. (Action: Dave).

A secondary power supply is available at present; it is used during the annual emergency power off exercise.


ST will review the complete power supply and distribution situation in B.513, and make proposals and cost estimates for evolving this to satisfy the medium and long-term requirements. (Action: Anne).

Cooling

ST will review the cooling situation in B.513, and make proposals and cost estimates for evolving this to satisfy the medium and long-term requirements. (Action: Anne).

Smoke detection

There is a working group with representatives of ST, TIS and IT looking into the smoke detection situation in the computer room. This should also cover the areas in the basement in which it is intended to install computing equipment.

The strategy for system packaging must be addressed. Currently, systems are stacked in open racks, and smoke detection is performed at the level of the room. This makes it difficult, if not impossible, to consider cutting power automatically. Compartmentalisation should be studied (such as packaging systems in closed racks with local smoke detection). The strategy adopted by other major computer centres should also be studied.

(Action: Dave - report on smoke detection working group; organise visits to other computer centres; Mario - investigate closed rack solutions).

Building 186 Annexe

{The following information was established after the meeting.} The status of this annexe is that a pre-study has been carried out by EP/SMI group (Alan Ball) in order to establish cost estimates, and budget approval within EP Division is being sought. The project has not yet been discussed with ST.

5. Next Meeting

The next meeting will take place on 1 February 2000 at 10:00 in B.513-2.023.


 

 

A few thoughts on the issue of infrastructure for building 513

Dave Underhill

Power

What will be the power requirement?
What infrastructure would need changing?
Do we need to replace what we have or can we build on top of the existing equipment?
How to distribute power within the building?
Current power distribution does not give flexibility.
Must know from where each piece of equipment is powered.
What its requirements are and how it can be switched off/on.
Should be able to control power to equipment remotely and under process control.
Can we profit from equipment with dual power connections?
CERN backup supply via diesels?
Secondary source for EPO testing?
What are the power needs for cooling?
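The points above about knowing where each piece of equipment is powered from, what it requires, and how it can be switched could be captured in a simple inventory. The sketch below is purely illustrative: the class names, fields and feed labels are hypothetical and do not describe any existing system in B.513.

```python
from dataclasses import dataclass


@dataclass
class Equipment:
    name: str
    feed: str                  # distribution point powering the item (hypothetical label)
    load_watts: float          # power requirement of the item
    remote_switchable: bool    # can it be switched off/on under process control?
    on: bool = True


class PowerInventory:
    """Records where each item is powered from and how it can be switched."""

    def __init__(self):
        self._items = {}

    def add(self, item):
        self._items[item.name] = item

    def load_by_feed(self):
        """Total load in Watts of powered-on equipment, per feed."""
        totals = {}
        for item in self._items.values():
            if item.on:
                totals[item.feed] = totals.get(item.feed, 0.0) + item.load_watts
        return totals

    def switch_off_feed(self, feed):
        """Switch off remotely controllable items on a feed.

        Returns the names of items that are still on and need manual action.
        """
        manual = []
        for item in self._items.values():
            if item.feed != feed or not item.on:
                continue
            if item.remote_switchable:
                item.on = False
            else:
                manual.append(item.name)
        return manual
```

With such a record, a shutdown could be automated for the remotely switchable items, leaving an explicit list of equipment that still has to be handled by hand.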

UPS

What will be the need?
Does all the equipment need protection? I would presume yes.
Should it protect the cooling as well?
How long do we need to maintain power?
Enough to overcome power spikes and variations.
Enough to allow CERN backup service (diesels?) to take the load.
Enough to be able to shut down equipment.
Could be fast if automated from process control or remote.
Very slow if manual.
Current batteries need replacing soon.
Currently 3 x 400 kVA modules, with a fourth on site but without batteries.

Cooling

Currently cool air forced down from the ceiling with free flow throughout the room.
Free and mixed cooling from Autumn to Spring.
Should we go for closed in racks with built in cooling?
Which is more efficient?

Smoke

Currently detected as air flows through tubes spread across the false ceiling.
Detects smoke in the room but not necessarily from where in the room.
Just one monitor at present, although we have an offer for a second.
Smoke detection in the room would imply EPO of the room.
Should we go for localised smoke detection equipment?
Incorporated in closed equipment racks?
Proximity detectors like those for the STK silos?
Should detection invoke automatic action?
Rundown / power off / extinction
The Pompiers are just 3 minutes away.

Racking

Current racking designed for handling equipment from floor level.
But it’s a high room so why not go higher and access from steps?
How often will installed equipment need to be accessed manually?
Build a gallery with walkway?
How about cabling and are cable lengths an issue?
Will the false floor support it?