Dashboard Task Monitoring abstract for EGI CF 2012

Title

User-centric monitoring of the analysis and production activities within the ATLAS and CMS Virtual Organisations using the Experiment Dashboard system

Overview

The Experiment Dashboard is a monitoring system developed for the LHC experiments to provide a view of the Grid infrastructure from the perspective of the Virtual Organisation (VO). It gives a transparent view of the experiment activities across different middleware implementations and combines Grid monitoring data with information that is specific to the VO. Job processing is the core of the VO computing activities. Scientists must be able to monitor the execution status and the application- and Grid-level messages of their tasks, which may run at any site within the VO. The Dashboard Task Monitoring applications collect a user-centric set of information about submitted tasks and expose it to the user. They provide a clear and precise view of the status of a task, including the distribution of jobs by site and over time, the reasons for failure, and advanced graphical plots, giving analysis and production users a more usable and attractive interface.

Description of Work (abstract)

Various fully distributed job submission methods and execution backends are used within both the ATLAS and CMS VOs. More than 700,000 ATLAS and 300,000 CMS jobs are submitted daily to the Worldwide LHC Computing Grid (WLCG) and are processed on different middleware platforms. The LHC job processing activity is divided into two categories: large-scale Monte-Carlo production jobs and user analysis jobs. The main difference between them is that the former is a well-organised activity performed by a group of experts, while the latter is chaotic analysis processing by diverse members of the physics community. The behaviour of analysis jobs is particularly difficult to predict, as analysis is normally carried out by users who are not necessarily experienced in using the Grid. All of these factors increase the complexity of monitoring the job processing activities within these VOs. While most of the existing monitoring applications are coupled to a specific Workload Management System (WMS), such as CRAB Monitoring for CMS and PanDA Monitoring for ATLAS, the Dashboard Task Monitoring applications support different middleware implementations and job submission systems. They combine Grid monitoring data with information that is specific to the experiment by collecting information from various sources, such as the user interface of the WMS, the job submission systems, and the jobs themselves, and they present all this information in a coherent way, as if it came from a single source. The development was user-driven: physicists were invited to test the prototypes in order to gather further requirements and identify weaknesses in the applications. The talk describes the current status of job processing monitoring, covers the Dashboard Task Monitoring applications for analysis and production users, which are widely used by the ATLAS and CMS communities, and provides an insight into future development plans.

Impact

The Dashboard Task Monitoring applications for analysis and production users have become very popular among ATLAS and CMS users and play an important role in the analysis and production operations of the LHC. In CMS alone, more than 250 distinct analysis users rely on them for their everyday work. Close collaboration with users and production teams has kept the tools focused on their exact monitoring needs.

Conclusions

Since 2009 there has been substantial progress in the development of applications for monitoring user analysis and production activities. This work is very important, since it contributes to the overall success of LHC offline computing. During the first year of data taking, the Dashboard Task Monitoring applications proved to be an essential component of the LHC computing operations. They are being developed in very close collaboration with the physicists who use the Grid infrastructure to submit analysis and production jobs. As a result, they respond well to the needs of the LHC experiments.

URL

http://dashboard.cern.ch

Presentation Type

Presentation/Paper

Track

Software services for users and communities

-- EdwardKaravakis - 17-Nov-2011
