Some semantic notations
"ioX, NbX, NoX, MoX[], MbX[], MobX[1:2,]"
A few specific prefixes are
used consistently in the names of variables to recall their meaning (at
least this is meant to be so).
- X
- stands here for an arbitrary string chosen to further specify
the variable considered.
- []
- refers to a dimension (single or multiple,
see later).
- io
- is an ordinal number, characteristic of an order.
For example io=3 may designate the 3rd element of a list of
objects.
- Nb
- is a cardinal number, characteristic of a quantity.
For example Nb=50 may designate the quantity of litres of gasoline in the
tank of your car while io=3 (the 3rd litre) corresponds clearly
to a different concept.
- No
- is a label, describing the object considered.
For practical reasons, this variable is defined as an integer (when needed,
other variables of ASCII nature could as well be defined).
Remark: I intentionally correlate the letters o and b with
the words numerO and nomBre that, respectively,
specify ordinal and cardinal quantities in the French language. It is
unfortunate that some languages mix up the 2 concepts by using a single
word such as number or numero.
- Mo[]
- is an array of objects of type No, while the implicit
index is of type io
- Mb[]
- is similarly an array of objects of type Nb
(implicit index io)
Notice that io and No may in some cases be
confused. Each plane has been given a
name (No) through the Detector Data Base (currently the Detector.Data file),
usually increasing in each sub-detector along the direction of the beam.
The natural succession of integers (1, 2, ...) has been adopted in the
absence of a reason to make things complicated when they need not be. For
Drift Chambers a 2-digit code is used instead: 10*I+J where I varies from 1
to 8 for the 8 modules (the large chamber counts for 2) and J increases
naturally along the beam starting from 1. Thus No has a
definite meaning to identify the plane considered. Assume now that one
treats a set of points measured in the dEdX sub-detector along a track. A
natural running index io would run from 1 to 3, and would be
equal to No (thus they could be confused). It may however
happen that no measurement was found in plane No=2, hence io=[1,2] would
correspond to No=[1,3]. Such a confusion cannot exist for the Drift
Chambers.
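The difference between io and No can be made concrete with a minimal sketch. It is written in Python purely for illustration (the program itself is FORTRAN); the function name `running_index_to_label` and the choice of Drift Chamber module I=2 are assumptions of the example, not names taken from the code.

```python
def running_index_to_label(hit_plane_labels):
    """Map the running index io (1, 2, ...) to the plane label No.

    hit_plane_labels lists, in order along the track, the label No of
    each plane in which a measurement was actually found.
    """
    return {io: no for io, no in enumerate(hit_plane_labels, start=1)}

# dEdX with all 3 planes fired: io and No coincide and are easily confused.
all_planes = running_index_to_label([1, 2, 3])

# No measurement found in plane No=2: io=[1,2] now corresponds to No=[1,3].
missing_plane = running_index_to_label([1, 3])

# Drift Chamber labels follow 10*I+J (assumed here: module I=2, J=1..3),
# so a label can never be mistaken for a running index.
drift = running_index_to_label([10 * 2 + j for j in (1, 2, 3)])
```

In the second case the two numberings diverge as soon as one plane is missing, which is exactly why they should not be treated as interchangeable.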
- Mob[1,]
- is an array of ordinal numbers (addresses) pointing into the
variable X[].
- Mob[2,]
- is an array of cardinal
numbers (the number of elements of X[] concerned).
Here the indices 1 and 2 are implicitly correlated
to the letters o (for numerO=ordinal)
and b (for nomBre=cardinal).
For example, let
i=MobX(1,)
n=MobX(2,)
Then
X(i) is the first element considered
X(i+n-1) is the last element considered
These Mob concepts are extensively discussed in the rest of
this note.
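As a minimal sketch of the rule X(i) to X(i+n-1), here is the same lookup in Python (used only for illustration; the values stored in X are made up and the helper name `mob_slice` is hypothetical). The 1-based FORTRAN addresses are kept in MobX and converted to Python's 0-based indexing in one place only.

```python
def mob_slice(X, mob_x, k):
    """Return the elements X(i) .. X(i+n-1) described by entry k of MobX.

    mob_x[0][k] plays the role of MobX(1,k): 1-based address i of the
    first element. mob_x[1][k] plays the role of MobX(2,k): the number n
    of elements. Note that k itself is a 0-based Python position.
    """
    i = mob_x[0][k]
    n = mob_x[1][k]
    return X[i - 1 : i - 1 + n]

X = [11, 12, 13, 21, 22, 31]   # made-up contents
MobX = [[1, 4, 6],             # addresses (ordinal, "o")
        [3, 2, 1]]             # sizes (cardinal, "b")

second = mob_slice(X, MobX, 1)  # the elements X(4) .. X(5)
```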
Now why is it useful to formalise all this?
It became necessary to devise methods of dynamic storage some 25-odd years
ago, when memory was expensive while the demand for it
was growing fast. HYDRA management opened the way, pushed by
the strong incentive of Bubble Chamber data.
A successor (extended) version, ZEBRA, followed (other similar
packages, such as BOSS in Hamburg, were also developed;
I do not intend to be exhaustive on this
subject). These programs took care of the data space problem at the cost
of a fairly heavy load on users that only large groups could adopt (or
rather could not avoid in order to maintain consistency between the
contributions of small groups).
FORTRAN90 has in principle adopted some
similar structural aspects. The difficulties of putting together the
contributions of small groups into ever larger collaborations
(currently LHC) seem to make it necessary to impose more general
programming concepts, namely Object Oriented languages such as C++, and even a
variant specially adapted to LHC.
I am sure that we should not embark on such adventures (DIRAC is too
small to waste time). Thus I developed a very simple implementation of
dynamical storage that minimises the recurrent problem of checking
all over the program for dimension overflows. In addition, grouping in
common arrays objects that are of a similar nature, though related to
different sub-detectors, helps in writing code that is as often as possible
detector independent. The trivial idea is based on the variable described
above, Mob[1:2,], which I try to clarify in the following.
- Dynamical storage => space saving
Space saving is no longer a serious objective because the cost of
memory is now fairly low. However, for this very reason (!), one is tempted
to define (very) large dimensions for all variables in order to be
protected against failures to process some odd events.
Let Xi(Ni) [i=1,Nb] be a set of Nb variables which are used to store
some parameters characteristic of an event. X1 and X2 could represent
ADC signals from the first 2 planes of the dEdX sub-detector, definitely
concepts of a common nature. X3 might represent ADC signals from Vertical
Hodoscopes, again similar concepts though from a different sub-detector.
X4 might represent TDC information, that is certainly a different concept
(whatever the sub-detector).
Given this, one needs to set each of the
corresponding dimensions (here N1 to N4) to the maximum value it can reach
in any event. Assume that one has the possibility to store the relevant
event information successively into the variables X1, X2, X3 and X4.
This means completing the filling of X1 before starting that of X2, which
implies some reasonable ordering of the input data (this IS NOT necessarily
the case; it is however for DIRAC as far as I know). The sizes of
the slots needed to store each of these variables, say J1 to J4, are by
construction bounded above by the corresponding dimensions, N1 to N4.
The idea is to consider instead a new variable, say X, into which one will
store successively the variables of the example above, that is X1
to X4. The overall size will be in this case [J1+J2+J3+J4] rather than
[N1+N2+N3+N4]. To be precise, X must be given a dimension in the program
that is Max [Sum {Ji}] rather than Sum [Max {Ji}], the former being in
general smaller than the latter (i.e. never larger). This evidently
requires recording somewhere at which address in X[] the
information corresponding to Xi starts and how many successive words (i.e.
Ji) are allocated to it. This is where MobX(1:2,) enters into the
game: the value 1 of the first index gives the address (in X) and the
value 2 gives the size used. For example in the simple case
mentioned above Xi starts at X(MobX(1,i)) and spans MobX(2,i) objects.
Notice that it might happen that one variable needs more space than
originally foreseen, due for example to local errors on the original data.
Minor overflows (over expected maximal dimension) could still be handled
by this method.
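The packing described above can be sketched as follows. This is an illustration in Python, not the FORTRAN of the program: the helper name `pack` and the two made-up events are assumptions of the example. It also compares the two dimensions discussed, Sum [Max {Ji}] for separate arrays and Max [Sum {Ji}] for the shared array.

```python
def pack(blocks):
    """Store the blocks X1, X2, ... one after the other into one array X.

    Returns (X, MobX) where MobX[0][i] is the 1-based start address of
    block i in X and MobX[1][i] is the number of words Ji it occupies.
    """
    X, addresses, sizes = [], [], []
    for block in blocks:
        addresses.append(len(X) + 1)   # next free word, FORTRAN-style
        sizes.append(len(block))
        X.extend(block)
    return X, [addresses, sizes]

# Two made-up events, each with four blocks (X1..X4) of varying size Ji.
events = [
    [[5, 7], [3], [8, 8, 9], [40]],
    [[6], [2, 4, 4], [1], [41, 42]],
]

# Dimension needed with one array per variable: Sum [Max {Ji}]
sum_of_max = sum(max(len(ev[i]) for ev in events) for i in range(4))
# Dimension needed with a single shared array: Max [Sum {Ji}]
max_of_sum = max(sum(len(block) for block in ev) for ev in events)

X, MobX = pack(events[0])
```

With these made-up sizes the shared array needs 7 words where separate arrays would need 10. The margin left over in the shared array is also what lets one block grow somewhat beyond its expected size when the others happen to be short.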
- Commodity of space sharing
The maximum allowed number of dimensions varies with compilers, but
large values (4 IS already large) are normally avoided because they
generate catastrophic speed problems.
The dimension of X[], represented so far by the shorthand
"[]", may be of a reasonably arbitrary nature.
X(i:j), where i,j are ordered (i ≤ j) signed integers, is 1-dimensional,
but X(1:3,-7:25), 2-dimensional, is equally valid for the discussion
considered here (and more dimensions are still acceptable). However
objects stored in series into X[] MUST share the same dimensions
after the first one (when many are used). Indeed MobX(1,i), an address,
refers implicitly to the first dimension of X[]; the other dimensions
MUST be common to all "i". Otherwise MobX would have to be a more
complex object.
The first index may be implicitly matched to multiple dimensions,
which might turn out to be tricky. For example one may store a set of
points, intersections of a track with a set of sub-detectors, using
either XYZ(i) or X(i), Y(i), Z(i). MobXYZ or MobX, MobY, MobZ would give
access to these variables. The second index, j, of these Mob would
represent the sub-detector. Mob(1,j) would point into the arrays while
Mob(2,j) would indicate the number of successive words used per
sub-detector. NOTICE that in the XYZ example Mob(2,j) would be a
multiple of 3 (and consequently i=Mob(1,j) would vary by multiples of 3).
Depending on the problem considered one may have to choose between the
solutions XYZ or X,Y,Z. This is a VERY simple case, that could be solved
much more easily by using XYZ(3,i) instead of XYZ(i). The real
(non trivial) interest of the method arises when the 3 mentioned
above is i-dependent. For example dEdX measures a slab number and an
ADC (2 items of information, which may later turn out to be 3 if one observes TDC
as well), while Drift Chambers measure a wire number and a drift time.
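The flat XYZ(i) variant can be sketched as below, again in Python for illustration only. The helper names `store_points` and `points_of` and the coordinates are made up. The sketch shows that every size in MobXYZ is a multiple of 3 and that the addresses move in steps that are multiples of 3.

```python
def store_points(points_per_detector):
    """Store (x, y, z) triplets detector by detector in one flat array.

    Returns (XYZ, MobXYZ): MobXYZ[0][j] is the 1-based address of the
    first word of sub-detector j, MobXYZ[1][j] the number of words used.
    """
    XYZ, addresses, sizes = [], [], []
    for points in points_per_detector:
        addresses.append(len(XYZ) + 1)
        sizes.append(3 * len(points))      # 3 words per point
        for x, y, z in points:
            XYZ.extend((x, y, z))
    return XYZ, [addresses, sizes]

def points_of(XYZ, mob, j):
    """Recover the list of (x, y, z) triplets of sub-detector j."""
    start = mob[0][j] - 1                  # 1-based address to 0-based
    words = XYZ[start : start + mob[1][j]]
    return [tuple(words[k : k + 3]) for k in range(0, len(words), 3)]

# Made-up intersections of one track with three sub-detectors.
track = [
    [(0.1, 0.2, 10.0), (0.1, 0.3, 11.0)],  # 2 points
    [],                                    # no point found
    [(0.4, 0.9, 50.0)],                    # 1 point
]
XYZ, MobXYZ = store_points(track)
```

The same two functions would work unchanged with a different number of words per measurement, provided the constant 3 were replaced by a value looked up per sub-detector.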
Notice that there is a definite interest in processing information, when
possible, in a unified way by storing information from different
sub-detectors into a common array (or set of arrays), thus allowing
common procedures to be used. In contrast, manipulating variables whose names are
sub-detector dependent requires explicitly different code elements even in
the case where the underlying data processing could be made common (imagine
a global fit of all information on a track!). Keep in mind that duplication
of code elements, with slight modifications in the variable names, is a
nightmare for maintenance when bugs (or improvements) require an action.
A practical application
At an early stage of the program I devised a structure to store
information from Monte Carlo results together with a unified
way to represent the detector parameters I needed. This part of the code
exists in the current Pam file.
There is a part that I devised in order to have a primitive display
of the events. Pattern recognition was done in all parts of the detector
in an utterly brutal way (ALL combinations are considered) but this is
unimportant because I meant to look only at a few hundred events.
Track fitting is a bit smarter (though restricted to straight lines) but
done only piecewise (projections). I had thus enough information to
display events (measurements and tracks). The measurements (Monte Carlo!)
were stored into a dynamical structure (routine GetMeasFromMC) which I
later transposed into GetMeasFromData that is operational for real data
(it is an interface to the decoding routines of Valeri). This same
structure is used in the non-Mac event display developed by
J.-L. Narjoux.
The geometrical characteristics of the detector are defined in terms of
planes (Cherenkov counters are so far considered as planes at
entrance and exit windows). A set of Global planes is defined
at a fairly fundamental level (routine RefFramesDefine). A different set
concerns so-called Measurement planes (routine InitReconst) that is
currently hard coded in terms of projections (locally defined
as Classes). This part could be generalised (i.e. adjustable from outside
the program, avoiding recompilation) if felt useful. Finally there are
links between these objects and variables whose names ultimately must
be detector dependent.
A label has been associated with each sub-detector
(a "number" in the vague English sense). In principle this is an
identifier No, but one may be tempted to consider it an io
because successive integers were mostly used along the beam (1 has been
used for the target, thus the first sub-detector, MSGC, is labelled 2):
1. Target
2. MSGC
3. Scintillating Fibres
4. Ionisation Detector
5. Drift Chambers
6. Vertical Hodoscopes
7. Horizontal Hodoscopes
8. Pre Shower counters
9. Muon counters
10. Cherenkov
Unfortunately no information was available at that time for the
Cherenkov, hence it was forgotten when the list of labels was defined
according to the detectors' positions along the beam. In addition,
other sub-detectors are already being considered for addition in the future, so
one cannot rely on this implicit rule; you had better get used to
distinguishing between the two concepts io
and No.
Indices used here (MSGC is taken as an example wherever a detector
name is used):
- IoPlaneGlobal = [1 ... NbPlanesGlobal]
- IoPlaneMeas = [1 ... NbPlanesMeas]
- IoPlaneDet = [1 ... NbPlanesMSGC]
General links:
- RefFromMeasPlanes(ioPlaneMeas) = ioPlaneGlobal
- RefToMeasPlanes(ioPlaneGlobal) = ioPlaneMeas
Sub-detector dependent (here MSGC taken as an example):
- NoDet = 2 ! [2 means here MSGC, refer to the list above]
- RefToDetPlanes(1,ioPlaneGlobal) = NoDet
- RefToDetPlanes(2,ioPlaneGlobal) = ioPlaneDet
- RefToGlobalFromMSGC(ioPlaneDet) = ioPlaneGlobal
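How these link tables fit together can be sketched as follows, in Python for illustration only. The plane layout (5 global planes, 3 measurement planes, 2 MSGC planes) is entirely made up, and the use of 0 to mean "no link" is an assumption of the sketch, not something stated for the program. Only the table names are taken from the lists above.

```python
# Hypothetical layout: 5 global planes, of which global planes 2, 3 and 5
# are measurement planes; global planes 2 and 3 belong to MSGC (NoDet = 2).
NoDet = 2

# RefFromMeasPlanes(ioPlaneMeas) = ioPlaneGlobal (keys are 1-based indices)
RefFromMeasPlanes = {1: 2, 2: 3, 3: 5}

# RefToMeasPlanes(ioPlaneGlobal) = ioPlaneMeas; 0 marks "not a measurement
# plane" (an assumption of this sketch).
RefToMeasPlanes = {g: 0 for g in range(1, 6)}
for io_meas, io_global in RefFromMeasPlanes.items():
    RefToMeasPlanes[io_global] = io_meas

# RefToGlobalFromMSGC(ioPlaneDet) = ioPlaneGlobal
RefToGlobalFromMSGC = {1: 2, 2: 3}

# RefToDetPlanes(1,ioPlaneGlobal) = NoDet
# RefToDetPlanes(2,ioPlaneGlobal) = ioPlaneDet
RefToDetPlanes = {io_global: (NoDet, io_det)
                  for io_det, io_global in RefToGlobalFromMSGC.items()}

def links_are_consistent():
    """Check that forward and backward links agree with each other."""
    meas_ok = all(RefToMeasPlanes[g] == m
                  for m, g in RefFromMeasPlanes.items())
    det_ok = all(RefToGlobalFromMSGC[RefToDetPlanes[g][1]] == g
                 for g in RefToDetPlanes)
    return meas_ok and det_ok
```

A check of this kind is cheap to run once at initialisation and catches the most likely mistake, namely a forward table updated without its inverse.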