LHC Data sources
LHC machine data is normally available through:
- the CERN Accelerator Logging Service, which can be queried with the Timber application (provided Java 1.8 and jws are installed);
- the NFS file system available from the technical network (e.g. ssh cs-ccr-dev1), in the directory /user/slops/data/LHC_DATA/OP_DATA;
- the Post Mortem System database, accessible through its Java API and other dedicated tools.
Here we describe alternative and/or complementary sources and methods to obtain data:
EOS space
40 TB have been granted on the eosuser EOS instance in the space /eos/project/a/abpdata.
Data can be accessed from regular lxplus machines (e.g. ssh lxplus):
- in /eos/project/a/abpdata;
- using the eos commands (ls, cp, ...) to operate on the files, e.g.:
eos ls /eos/project/a/abpdata
- by mounting it as a filesystem in a local directory, e.g.:
mkdir eos; eos fuse mount ./eos
EOS data can also be accessed with the xrootd library, e.g.:
mkdir eos; xrootdfs ./eos -o rdr=root://eosuser.cern.ch//eos
The EOS abpdata space can also be accessed via the web at
http://cern.ch/abpdata
Castor
A subset of historical LHC data is stored in
/castor/cern.ch/user/r/rdemaria/data/
/castor/cern.ch/user/r/rdemaria/lhcfilldata/data
PyTimber and PageStore software
PyTimber is a Python library that provides access to the data stored in the CERN Accelerator Logging Service.
PageStore is a Python library that stores named and indexed records on disk for fast bulk reads.
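The core idea behind such an indexed record store can be illustrated with a short, self-contained sketch (this is a toy illustration of the concept, not the actual PageStore API or on-disk format): records are kept per variable name, sorted by timestamp, so a time-range query reduces to two binary searches plus a slice.

```python
import bisect

class MiniStore:
    """Toy in-memory analogue of an indexed record store.
    Records are kept per variable name, sorted by timestamp, so a
    time-range query is two binary searches plus a slice."""

    def __init__(self):
        self._data = {}  # name -> (sorted timestamps, values)

    def store(self, name, timestamps, values):
        """Insert records, keeping the per-variable timestamp order."""
        ts, vs = self._data.setdefault(name, ([], []))
        for t, v in sorted(zip(timestamps, values)):
            i = bisect.bisect(ts, t)
            ts.insert(i, t)
            vs.insert(i, v)

    def get_lim(self, name):
        """Return the (first, last) timestamp available for a variable."""
        ts, _ = self._data[name]
        return ts[0], ts[-1]

    def get(self, name, t1, t2):
        """Return records with t1 <= timestamp <= t2."""
        ts, vs = self._data[name]
        i1 = bisect.bisect_left(ts, t1)
        i2 = bisect.bisect_right(ts, t2)
        return list(zip(ts[i1:i2], vs[i1:i2]))

db = MiniStore()
db.store('BBQ:H', [10.0, 20.0, 30.0], [0.1, 0.2, 0.3])
print(db.get_lim('BBQ:H'))      # (10.0, 30.0)
print(db.get('BBQ:H', 15, 30))  # [(20.0, 0.2), (30.0, 0.3)]
```

The real PageStore keeps the pages on disk, which is what makes bulk reads over long time ranges fast; the query pattern (get_lim, then get over a time window) mirrors the examples shown below.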
The software is installed on the Jupyter notebook server and on lxplus.
In order to use them one needs a JDK >= 1.8. On Windows, be sure to install the 64-bit version if running on a 64-bit Windows installation.
If the Java library is in a non-standard location, set the path of libjvm.so in JAVA_JVM_LIB, e.g.:
export JAVA_JVM_LIB="/cvmfs/sft.cern.ch/lcg/releases/java/8u91-ae32f/x86_64-slc6-gcc49-dbg/jre/lib/amd64/server/libjvm.so"
On macOS, locate the installation with:
/usr/libexec/java_home -V
You should see something like:
1.8.0_102, x86_64: "Java SE 8" /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home
1.7.0_79, x86_64: "Java SE 7" /Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/Home
Choose the JDK you just installed, here:
export JAVA_HOME=`/usr/libexec/java_home -v 1.8.0_102`
Running java -version should now display:
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
Alternatively, this variable (specific to cmmnbuild-dep-manager) can be set:
export JAVA_JVM_LIB="/Library/Java/JavaVirtualMachines/jdk1.8.0_92.jdk/Contents/Home/jre/lib/server/libjvm.dylib"
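To check where libjvm actually lives before exporting JAVA_JVM_LIB, a small helper like the following can scan a JAVA_HOME tree. This is a hypothetical convenience script, not part of any of the packages above; the candidate sub-paths are the usual Linux and macOS JRE layouts:

```python
import os

# Typical locations of the JVM shared library inside a JDK/JRE tree
# (Linux: libjvm.so, macOS: libjvm.dylib).
CANDIDATES = [
    "jre/lib/amd64/server/libjvm.so",
    "jre/lib/server/libjvm.dylib",
    "lib/server/libjvm.so",
    "lib/server/libjvm.dylib",
]

def find_libjvm(java_home):
    """Return the first existing libjvm path under java_home, or None."""
    for rel in CANDIDATES:
        path = os.path.join(java_home, rel)
        if os.path.exists(path):
            return path
    return None

if __name__ == "__main__":
    home = os.environ.get("JAVA_HOME", "")
    print(find_libjvm(home) or "libjvm not found; set JAVA_JVM_LIB manually")
```

If the helper prints a path, that is the value to export as JAVA_JVM_LIB.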
Then install the following libraries on top of a recent Python installation (e.g. Anaconda):
In Linux:
pip install --upgrade jpype1
pip install --upgrade git+https://gitlab.cern.ch/scripting-tools/cmmnbuild-dep-manager.git
pip install --upgrade git+https://github.com/rdemaria/pytimber.git
pip install --upgrade git+https://github.com/rdemaria/pagestore.git
On Windows the procedure is more complicated, since jpype cannot be installed smoothly with pip:
- install the latest Anaconda distribution with Python 3.5;
- download and copy the file JPype1-0.6.1-cp35-none-win32.whl or JPype1-0.6.1-cp35-none-win_amd64.whl (depending on the architecture) from http://www.lfd.uci.edu/~gohlke/pythonlibs/#jpype into C:\Users\username\;
- open the Anaconda prompt and type:
conda install git
conda install --upgrade numpy
conda install --upgrade matplotlib
pip install JPype1-0.6.1-cp35-none-win_amd64.whl
pip install --upgrade git+https://gitlab.cern.ch/scripting-tools/cmmnbuild-dep-manager.git
pip install --upgrade git+https://github.com/rdemaria/pytimber.git
pip install --upgrade git+https://github.com/rdemaria/pagestore.git
A subset of historical LHC data is stored in EOS at /eos/project/abpdata using PageStore, and in Castor.
A centralized installation on lxplus is available via:
/afs/cern.ch/work/r/rdemaria/public/anaconda/bin/ipython
Accessing LHC database
Log in to lxplus-cb6 to access the database in the EOS space, or to any lxplus machine to just use PyTimber and PageStore locally.
To start an ipython prompt in a terminal, use:
/afs/cern.ch/work/r/rdemaria/public/anaconda/bin/ipython
In order to use PageStore on EOS, type at the Python prompt:
!cp /eos/project/abpdata/lhc/lhc.db .
import pagestore
db=pagestore.PageStore('lhc.db','/eos/project/abpdata/lhc/datadb/')
Now it is possible to query the database similarly to PyTimber, e.g.:
#obtain old BBQ raw data
db.search('%BBQ%')
t1,t2=db.get_lim('LHC.BQBBQ.CONTINUOUS_HS.B1:ACQ_DATA_H')
t1,t2=db.get_lim('LHC.BQBBQ.CONTINUOUS.B1:ACQ_DATA_H')
print(pagestore.dumpdate(t1))
print(pagestore.dumpdate(t2))
data=db.get('LHC.BQBBQ.CONTINUOUS.B1:ACQ_DATA_H',t1,t1+100)
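The times returned and accepted by these calls are Unix timestamps (seconds since the epoch), so t1+100 above means "100 seconds after t1". The standard library is enough to convert between timestamps and readable UTC date strings; the helpers below are a small convenience sketch (pagestore.dumpdate provides similar formatting):

```python
import calendar
import time

def to_unix(datestr, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a UTC date string to a Unix timestamp (float seconds)."""
    return float(calendar.timegm(time.strptime(datestr, fmt)))

def from_unix(ts, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a Unix timestamp back to a UTC date string."""
    return time.strftime(fmt, time.gmtime(ts))

t1 = to_unix("2016-04-30 12:00:00")
print(t1)                   # 1462017600.0
print(from_unix(t1 + 100))  # 2016-04-30 12:01:40
```

These timestamps can be passed directly as the t1, t2 arguments of db.get.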
The library and the data can also be accessed using the SWAN service.
Swan: Centralized Jupyter (beta) server
SWAN provides a Jupyter service accessible with CERN credentials.
The server can be accessed at swan.cern.ch.
When accessing the website, click on "Start My Server" and set
/eos/project/abpdata/setup_swan.sh
as the setup script in the form.
From outside CERN network, one can set-up an SSH tunnel:
ssh -D6789 lxplus.cern.ch
and set a SOCKS proxy in the browser, e.g. in Firefox:
Preferences -> Advanced -> Network -> Settings... -> Manual proxy configuration -> set SOCKS Host to 127.0.0.1 and Port to 6789.
Examples are in:
Xrootd and eos compilation
Compilation on Ubuntu is needed for the eos command-line tool.
Compile XrootD
sudo apt-get install git cmake libzfslinux-dev uuid-dev libsparsehash-dev libreadline-dev xfslibs-dev libattr1-dev libcppunit-dev micro-httpd libmicrohttpd-dev libzmq3-dev zlib1g-dev libfuse-dev libfuse2 libkrb5-dev libcrypto++-dev
git clone https://github.com/xrootd/xrootd.git
cd xrootd
mkdir build
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=/usr
make
sudo make install
Get the special header for zmq (bug in the Ubuntu distribution or in eos):
cd /tmp/
git clone https://github.com/zeromq/cppzmq
sudo cp cppzmq/zmq.hpp /usr/include/
Compile eos (not really successful)
git clone https://gitlab.cern.ch/dss/eos.git
cd eos
rm utils/zmq.hpp
mkdir build
cd build
cmake ../ -DCMAKE_INSTALL_PREFIX=/opt -DCLIENT=True
Meetings and Talks
1/6/2016 HSS meeting
General introduction to the tools:
Python-based tool for LHC data mining
29/04/2016
Present: Gianni, Guido, Nicolò, Panagiotis, Xavier
Actions (all):
- test PageStore standalone and from EOS; report feedback
- collect a list of variables of interest to store systematically
-- Main.RiccardoDeMaria - 2016-04-30