Summary of meeting with CMS on 12/12/02

Present:
Claudio Grandi (VRVS), David Stickland, Lucas Taylor, Lucia Silvestris, Norbert Neumeister, Emilio Meschi, Veronique Lefebure

Distributed Analysis

Current approach

CMS has just published its DAQ-TDR, for which a major distributed production and analysis effort was undertaken.

Production tools were developed, deployed and used to manage the workflow, transfer and replicate data, and publish results.
A Web interface to the production database (RefDB) allows users to get information about available datasets and their locations, and to query the metadata associated with them.
No further tools have been deployed to physicists to assist them in the subsequent steps of the analysis (fetching data, building local federations, preparing analysis jobs, submitting them).
These operations (particularly those related to file transfer and federation management) require specific skills and knowledge and, in some cases, system manager privileges.

The final steps of the analysis consist of producing PAW ntuples or ROOT trees and analysing them interactively with PAW or ROOT.

COBRA provides the possibility to annotate events with a user-defined object (tag) and to store them in collections. These user collections have been used mainly to perform fast event selection in batch.
Interactive event reconstruction, analysis and visualisation have also been routinely done using IGUANA.
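
The tag-based selection described above can be illustrated with a minimal Python sketch. It is not the COBRA API: the EventTag class, its fields and the select function are hypothetical names chosen for illustration. The point it shows is that a cut is evaluated on the small tag objects only, so the full event data need not be read to build the list of selected events.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class EventTag:
    """Small user-defined summary attached to one event (hypothetical layout)."""
    event_id: int
    n_muons: int
    max_pt: float  # illustrative quantity: highest transverse momentum, in GeV


def select(tags: Iterable[EventTag],
           predicate: Callable[[EventTag], bool]) -> List[int]:
    """Return the ids of the events whose tag passes the cut.

    Only the tags are inspected; the full events are never loaded here.
    """
    return [tag.event_id for tag in tags if predicate(tag)]


# A toy user collection of tagged events (invented values).
collection = [
    EventTag(event_id=1, n_muons=0, max_pt=12.0),
    EventTag(event_id=2, n_muons=2, max_pt=35.5),
    EventTag(event_id=3, n_muons=1, max_pt=8.2),
    EventTag(event_id=4, n_muons=2, max_pt=18.0),
]

# Keep events with at least two muons and one hard candidate.
dimuon_ids = select(collection, lambda t: t.n_muons >= 2 and t.max_pt > 20.0)
```

The resulting list of event ids could then drive a batch job that reads only the selected events.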

Analyses have also been performed using Mathematica or Excel, starting from ROOT trees.

Short Term Plans

CMS plans to develop and deploy a new version of its production tools in preparation for the next major production and analysis effort, which will start in June 2003.
These new tools are integrated with EDG and VDT middleware in order to exploit and complement the EDG-1 software base.
Extension of these tools to support user analysis is foreseen on a similar time scale.

Medium/Long Term view

CMS considers essential the deployment of tools that will assist physicists in preparing, submitting and monitoring jobs, starting from high-level descriptions of the task in hand.
Required features are:

CMS considers that the core of these tools (if not the interface) should be in common with production.
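
As an illustration of what "starting from a high-level description of the task" could mean in practice, the following Python sketch expands one task description into individual job specifications. It is only a sketch under stated assumptions: TaskDescription, JobSpec, split_task, the dataset name and the executable name are all hypothetical and do not correspond to any CMS production tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TaskDescription:
    """High-level description written by the physicist (hypothetical fields)."""
    dataset: str
    total_events: int
    events_per_job: int
    executable: str


@dataclass(frozen=True)
class JobSpec:
    """One concrete job derived from the task description."""
    job_index: int
    dataset: str
    first_event: int
    n_events: int
    executable: str


def split_task(task: TaskDescription) -> List[JobSpec]:
    """Split a task into jobs that each process a contiguous event range."""
    if task.events_per_job <= 0:
        raise ValueError("events_per_job must be positive")
    jobs = []
    starts = range(0, task.total_events, task.events_per_job)
    for index, first in enumerate(starts):
        # The last job may be shorter than the others.
        n_events = min(task.events_per_job, task.total_events - first)
        jobs.append(JobSpec(index, task.dataset, first, n_events, task.executable))
    return jobs


# Placeholder names and numbers, for illustration only.
task = TaskDescription(dataset="example_dataset", total_events=2500,
                       events_per_job=1000, executable="analysis.sh")
jobs = split_task(task)
```

A submission and monitoring layer would then act on the list of JobSpec objects; keeping that layer separate from the description is one way the same core could serve both production and user analysis.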

Interactive Environment

CMS has not yet finalised its computing model. The role of desktop workstations is therefore not fully specified.
Still, CMS physicists believe that interactive analysis of locally stored data will play a key role in the CMS analysis model.

An interactive environment based on Python has been deployed in 2002. It allows users to query, manipulate and loop over event collections and the associated metadata.
Access to the full C++ analysis environment was possible by dynamically loading the corresponding shared libraries. Full integration with Lizard was provided, including interactive histogramming starting from the event tag and from the standard event structure. Although deployed to physicists in late 2001, no real use of this environment has been made to produce published results.
The most used interactive environment has been native PAW or ROOT, taking as input files produced in batch.
No work has been performed to integrate the COBRA interactive environment with either PAW or ROOT.

A prototype of an analysis client-server environment (Clarens) has been developed using web services. It includes Objectivity, ROOT and RDBMS back ends. It uses Globus for authentication, authorisation and security.
Like any other grid-based tool, it has not been deployed to physicists yet.

The current environment is clearly not satisfactory. ROOT itself is seen as just a starting point. CMS physicists expect in future to be able to choose among a multitude of tools on their desktop. It is essential that these tools can interoperate with each other.
CMS physicists also expect some sort of common intermediate analysis data format that can be understood by the CMS framework and by all these tools.
There is a clear need to have the full power of the CMS simulation, reconstruction and analysis software system available in the interactive environment, fully integrated with the analysis tools.
This environment should also be available for online monitoring.

Mixed feelings exist about the real need to run exactly the same algorithms in batch (online or offline) and in interactive analysis. Final analysis code is seen by some physicists as disposable: re-engineering it is considered mandatory if it is to be moved to a production environment.

AIDA

CMS physicists feel that a common interface to analysis objects and tools is required to guarantee interoperability. The current AIDA interface may be considered a starting point. Recent efforts to evaluate AIDA-3 have been frustrated by the poor documentation and support for it at CERN.
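
The interoperability argument can be made concrete with a small Python sketch of a common interface to one analysis object. It is not the AIDA interface: the Histogram1D base class is only loosely modelled on the idea of an abstract one-dimensional histogram, and BinnedHistogram, ValueList and describe are hypothetical names. The sketch shows a tool written against the interface working unchanged with two unrelated implementations.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class Histogram1D(ABC):
    """Common interface that every tool codes against (illustrative only)."""

    @abstractmethod
    def fill(self, x: float, weight: float = 1.0) -> None: ...

    @abstractmethod
    def entries(self) -> int: ...

    @abstractmethod
    def mean(self) -> float: ...


class BinnedHistogram(Histogram1D):
    """Fixed-width bins; values outside [low, high) are dropped in this sketch."""

    def __init__(self, n_bins: int, low: float, high: float) -> None:
        self.low, self.high = low, high
        self.bins: List[float] = [0.0] * n_bins
        self._n, self._sum_w, self._sum_wx = 0, 0.0, 0.0

    def fill(self, x: float, weight: float = 1.0) -> None:
        if not (self.low <= x < self.high):
            return
        width = (self.high - self.low) / len(self.bins)
        # Clamp to guard against floating-point rounding at the upper edge.
        index = min(int((x - self.low) / width), len(self.bins) - 1)
        self.bins[index] += weight
        self._n += 1
        self._sum_w += weight
        self._sum_wx += weight * x

    def entries(self) -> int:
        return self._n

    def mean(self) -> float:
        return self._sum_wx / self._sum_w if self._sum_w else 0.0


class ValueList(Histogram1D):
    """Unbinned implementation that simply keeps every (value, weight) pair."""

    def __init__(self) -> None:
        self._values: List[Tuple[float, float]] = []

    def fill(self, x: float, weight: float = 1.0) -> None:
        self._values.append((x, weight))

    def entries(self) -> int:
        return len(self._values)

    def mean(self) -> float:
        total = sum(w for _, w in self._values)
        return sum(w * x for x, w in self._values) / total if total else 0.0


def describe(hist: Histogram1D) -> str:
    """A 'tool' that relies only on the common interface."""
    return f"{hist.entries()} entries, mean {hist.mean():.2f}"


binned, unbinned = BinnedHistogram(10, 0.0, 10.0), ValueList()
for hist in (binned, unbinned):
    for value in (1.0, 2.0, 3.0):
        hist.fill(value)
```

Calling describe on either object gives the same summary, which is the property a shared interface is meant to guarantee across tools.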

Python

Although Python is considered, and indeed used, as a valid alternative to shell and Perl scripting, there are still mixed feelings about its effectiveness as an interface for physics analysis.
As noted above, a Python binding to COBRA down to the event level has been provided for a year. Few physicists have used it, and mainly to manage event collections. More traditional Unix shell command-line interfaces and the C++ API have been more widely preferred.

Event Display

IGUANA is routinely used as an interactive event display. The current focus has been on realistic 3D rendering of the detector and of simulated and reconstructed objects, with full interactivity.
Future work includes specialised 2D and 3D rendering.

IGUANA developers have identified commonalities with Panoramix, the LHCb event display, and several features of the latter that they may wish to implement in IGUANA.
CMS would welcome a collaboration of the two development teams, possibly under the umbrella of the LCG.

Final Consideration

CMS considers the access of physicists to distributed resources essential.
CMS considers this a critical area and a source of possible competitive advantages.

CMS physicists share the view that the interactive analysis environment should allow interaction with all components of the CMS simulation, reconstruction and analysis software.

Possible common projects:

References

CCS: Computing and Core-Software
PRS: Physics Reconstruction and Selection
Object Oriented Software: include links to Cobra, Orca, Iguana
DAQ-TDR
Analysis Workshop (7 November 2001) and its summary and conclusions
Production
Grid Integration


Vincenzo Innocente
Most recently modified on Thu Jan 9 18:21:02 MET 2003 by Vincenzo Innocente