Processing of the WLCG monitoring data using Hadoop.
The Worldwide LHC Computing Grid (WLCG) today comprises more than 170 computing centres where more than 2 million jobs are executed daily and petabytes of data are transferred between sites. Monitoring the computing activities of the LHC experiments over such a huge, heterogeneous infrastructure is extremely demanding in terms of computation, performance and reliability. Furthermore, the volume of monitoring data is constantly increasing, which presents a further challenge for the monitoring systems. While existing solutions are traditionally based on Oracle for data storage and processing, recent developments evaluate Hadoop/MapReduce for processing large-scale monitoring datasets. Hadoop, an open-source implementation of the MapReduce paradigm, is an increasingly popular framework for processing datasets at the terabyte and petabyte scale on commodity hardware. In this contribution, we describe the integration of Hadoop/MapReduce data processing in the Experiment Dashboard framework and the first experience of using this technology for monitoring the LHC computing activities.
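To illustrate the MapReduce paradigm mentioned above, the following is a minimal, purely illustrative sketch in plain Python of the map and reduce phases aggregating job counts per site. The record layout and site names are hypothetical, not the actual Dashboard monitoring schema; in a real Hadoop deployment these phases would run distributed over HDFS data rather than over an in-memory list.

```python
from collections import defaultdict

# Hypothetical monitoring records as (site, job_count) pairs;
# purely illustrative, not the real Dashboard data model.
records = [
    ("SITE_A", 120), ("SITE_B", 80), ("SITE_A", 200),
    ("SITE_C", 50), ("SITE_B", 40),
]

def map_phase(records):
    # Map: emit one (key, value) pair per input record.
    for site, jobs in records:
        yield site, jobs

def reduce_phase(pairs):
    # Shuffle + reduce: group emitted pairs by key, then aggregate
    # the values for each key (here, a simple sum).
    totals = defaultdict(int)
    for site, jobs in pairs:
        totals[site] += jobs
    return dict(totals)

totals = reduce_phase(map_phase(records))
print(totals)  # {'SITE_A': 320, 'SITE_B': 120, 'SITE_C': 50}
```

Hadoop applies the same two-phase structure at scale: the framework partitions the input across worker nodes, runs the map function locally on each partition, and routes all values sharing a key to the same reducer.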
- Track: Distributed Processing and Data Handling
- Co-Authors from IT-ES: E. Karavakis, A. Beche, J. Andreeva, P. Saiz, D. Tuckett, J. Schovancova, I. Dzhunov
- Presentation Type: Parallel
- Author list: E. Karavakis, A. Beche, J. Andreeva, P. Saiz, D. Tuckett, J. Schovancova, I. Dzhunov, I. Kadochnikov, S. Belov