Computing Technical Design Report

5.5 Distributed Analysis on the Grid

The aim of the distributed-analysis project is to enable individual ATLAS users to use Grid resources for their analyses in physics and detector groups. So far, the large-scale use of Grid resources has been demonstrated only for the simulation and reconstruction of physics data, activities that are typically supported by small teams of Grid experts. The complications of computing on a global scale have so far been prohibitive for individual users and smaller groups, who have not yet been able to take advantage of these resources.

Nevertheless, once the LHC starts, Grid resources will be needed to support analysis at the required scale. In the following sections the ATLAS-specific aspects of distributed analysis are presented in more detail. The requirements on data management and workload management for distributed analysis are discussed, and the various ATLAS activities are then presented in turn; their current status is reviewed and future developments are outlined.

5.5.1 The ATLAS Distributed Analysis Model

Many aspects of Grid-based analysis have already been discussed in detail by several working groups [5-20]-[5-23]. These considerations are clearly relevant for the ATLAS activity; to complement these discussions, some ATLAS-specific aspects are presented. For the purpose of this discussion, we group together analysis tasks and any other medium-scale computing activity, such as individual testing of new simulation, calibration or reconstruction algorithms before putting them in production.

The need for distributed analysis follows from the distribution of the data among various computing facilities according to the ATLAS Computing Model, and from the availability of the CPU resources required to perform the actual analysis on large datasets. An analysis job will typically consist of a Python script that configures and runs a user algorithm in the Athena framework, reading from a POOL event file and writing a POOL file and/or ROOT histograms and/or n-tuples. More interactive analysis may be performed on large datasets stored as ROOT n-tuples. The work model of the different physics groups may vary within the ATLAS Collaboration, as it may be strongly influenced by the size of their relevant "signal" and "background" datasets; the distributed analysis system must be flexible enough to support all work models as they emerge.
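
As an illustration, the job-options script of such an analysis job could have the following structure; all package, algorithm and property names are invented for the example, and the exact configuration idiom (old-style job options shown here) depends on the Athena release:

    # Illustrative Athena job-options fragment, run with "athena.py MyAnalysis_jobOptions.py".
    # All package, algorithm and property names are invented for the example, and the
    # configuration idiom depends on the Athena release. The include that sets up the
    # reading of POOL files is omitted for brevity.

    theApp.EvtMax = -1                            # process all events in the input

    # Read AOD events from a POOL file (file name is a placeholder).
    EventSelector.InputCollections = [ "AOD.pool.root" ]

    # Load the user's library and schedule the analysis algorithm.
    theApp.Dlls   += [ "MyAnalysisPackage" ]      # hypothetical user package
    theApp.TopAlg += [ "MyAnalysisAlg" ]          # hypothetical Athena algorithm

    # Configure algorithm properties, including the ROOT output file.
    MyAnalysisAlg = Algorithm( "MyAnalysisAlg" )
    MyAnalysisAlg.JetPtCut      = 20000.0         # hypothetical property, in MeV
    MyAnalysisAlg.HistogramFile = "MyAnalysis.root"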

Some important requirements on the distributed analysis system are the ease of use by collaboration members, the robustness of the system (including its interface to the Grid systems), and job traceability. In addition, the look and feel of the system should be the same whether one sends a job to one's own machine, a local interactive cluster, the local batch system, or the Grid.

5.5.2 Data Management

The ATLAS Data Management Project is responsible for the organization of the datasets at the various computing facilities according to the ATLAS Computing Model. A first model using local file catalogues and a global dataset location service has been outlined in Chapter 4. In this model a dataset may be distributed to a number of computing centres; a still open question is the integration of the dataset location service with the workload management system.

Users have to be provided with mechanisms for "random access" to relatively rare events out of a large number of files, in order to implement fast pre-filtering based on selected quantities (trigger bits and/or reconstructed quantities). In ATLAS a first iteration of the TAG datasets is in preparation and under initial testing. Selecting events from this TAG database, and then accessing the underlying event data, will form an important verification of the capabilities of the data management system.
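
As an illustration of such a TAG-based pre-filtering step, the following PyROOT sketch scans a TAG file and selects events by a trigger bit and a reconstructed quantity; the tree and attribute names are assumptions made for the example rather than the actual TAG schema:

    # Illustrative TAG pre-filtering with PyROOT. The tree and attribute names are
    # assumptions made for the example and do not reproduce the real TAG schema.
    import ROOT

    tag_chain = ROOT.TChain("POOLCollectionTree")   # assumed name of the TAG tree
    tag_chain.Add("mydataset.TAG.root")

    selected = []
    for tag in tag_chain:
        # Hypothetical TAG attributes: a trigger bit and a reconstructed quantity.
        if tag.PassedMissingETTrigger and tag.MissingET > 100000.0:   # MeV
            selected.append((tag.RunNumber, tag.EventNumber))

    print("selected %d of %d events" % (len(selected), tag_chain.GetEntries()))
    # The selected (run, event) pairs, or better the event references stored in the
    # TAG, would then be used to read only those events from the underlying AOD/ESD.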

A further question to be studied is the storage of user data in the Grid environment. A large amount of data resulting from analysis jobs will have to be stored and managed. It is in the nature of the Grid that these datasets are not centrally known, and they therefore require an automated management system. This implies the concept of file ownership and individual quota management based on this ownership. While many ideas have already been presented, practical implementations have yet to be demonstrated.
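
As a purely illustrative sketch of what ownership-based quota accounting could look like (no such ATLAS service exists yet, and the policy and numbers are invented):

    # Toy illustration of ownership-based quota accounting; no such ATLAS service
    # exists yet, and the owners, numbers and policy are invented for the example.

    quota_gb = {"asmith": 100.0, "top_group": 2000.0}     # allowed storage per owner
    usage_gb = {"asmith": 87.5,  "top_group": 640.0}      # storage currently used

    def may_store(owner, requested_gb):
        """Accept a new output file only if the owner stays within quota."""
        return usage_gb.get(owner, 0.0) + requested_gb <= quota_gb.get(owner, 0.0)

    print(may_store("asmith", 20.0))   # False: 87.5 + 20 GB would exceed the quota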

A final issue is the potential mismatch of user data management between the Grid environment and the local computer centres. Data that have been produced by analysis on the Grid will need further refinement by local interactive analysis. Transparent access to these data has to be provided, at least for data residing on storage elements close to the user.

Given the importance of data access to distributed analysis, there has to be a very close relation between the two projects: distributed analysis relies on the tools provided by data management and has to verify that its requirements are taken into account. While many discussions have taken place and clear requirements for the interplay of the distributed data management system and distributed analysis have been identified, very little has been prototyped and tested so far. The second half of 2005 will see the implementation of the distributed data management system (described in Chapter 4) and the first tests of remote data access for user analysis tasks.

5.5.3 Workload Management

Assuming that the location of the data (or datasets) will determine where analysis is performed, it will be relatively simple to organize the overall analysis activity by placing the data according to the plans of the ATLAS Collaboration. This means that user jobs will be executed where a copy of the required data is already present at submission time. The problem then reduces to the normal fair-share scheduling of a batch system, which is by no means a trivial task when multiple users or working groups are active at the same time.
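
The brokering decision described above can be illustrated schematically as follows; the catalogue contents, site names and load figures are invented, and no existing ATLAS service exposes these interfaces:

    # Schematic brokering sketch: send the job only to a site that already holds a
    # replica of its input dataset. The catalogue contents, site names and load
    # figures are invented, and the interfaces are hypothetical.

    # Assumed view of the dataset location service: dataset name -> hosting sites.
    replica_catalogue = {
        "rome.higgs.aod.v1": {"CERN", "BNL", "LYON"},
        "rome.ttbar.aod.v1": {"BNL", "FZK"},
    }

    # Assumed load indicator per site, e.g. the number of queued analysis jobs.
    site_load = {"CERN": 420, "BNL": 150, "LYON": 300, "FZK": 80}

    def choose_site(dataset):
        """Pick the least-loaded site that already holds a replica of the dataset;
        fair-share among users is then left to that site's batch system."""
        candidates = replica_catalogue.get(dataset, set())
        if not candidates:
            raise LookupError("no replica of %s is registered" % dataset)
        return min(candidates, key=lambda site: site_load.get(site, 0))

    print(choose_site("rome.ttbar.aod.v1"))   # -> FZK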

As pointed out in the HEPCAL2 document [5-22], there are several scenarios relevant for analysis; it is the second of these that is likely to be the most interesting and challenging, and one can assume that it will cover most analysis activity.

Of particular interest are models that provide some form of interactive response. In ATLAS the DIAL project [5-24] has been pioneering such fast response by means of low-latency batch queues. Some projects, such as PROOF, are studying approaches that go beyond the batch model; another example is the DIANE [5-25] model discussed below. It has to be pointed out that many aspects of such a model, such as resource sharing, have not yet been demonstrated in a realistic scenario.

5.5.4 Prototyping Activities

ATLAS users have not yet been provided with a common environment that gives access to ATLAS resources on the Grid. On the other hand, a number of prototyping activities have been developed within the ATLAS Collaboration, some of them in cooperation with national Grid partners; they are presented in turn in this section. Although none of these activities have provided ATLAS with a complete system that satisfies the requirements outlined above, they provide some of the building blocks with which we can construct the ATLAS distributed analysis system before the end of 2005.

5.5.4.1 AMI

The ATLAS Metadata Interface (AMI) project provides metadata services to ATLAS (see Section 4.7.2). According to its design principles, AMI uses an RDBMS and provides access to metadata through a Java API. A web service that supports clients written in different programming languages has also been developed. For distributed analysis the user wants to select datasets based on metadata and retrieve the logical file names forming each dataset. Recently new requirements for metadata have been defined [5-26]; an iterative process involving the database developers and the distributed-analysis community is necessary to improve the functionality of AMI for analysis applications.
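
The dataset-selection step can be sketched as follows; the client class, its methods and the metadata fields are hypothetical stand-ins introduced for illustration and do not correspond to the actual AMI API:

    # Hypothetical sketch of the dataset-selection step. The client class, its methods
    # and the metadata fields are invented stand-ins, not the AMI API.

    class FakeMetadataClient:
        """Stand-in for a metadata-service client, backed by a small dictionary."""
        _catalogue = {
            "rome.ttbar.aod.v1": {"datatype": "AOD", "process": "ttbar",
                                  "lfns": ["ttbar.aod._0001.pool.root",
                                           "ttbar.aod._0002.pool.root"]},
        }

        def search_datasets(self, **criteria):
            """Yield the names of datasets whose metadata match all criteria."""
            for name, meta in self._catalogue.items():
                if all(meta.get(key) == value for key, value in criteria.items()):
                    yield name

        def list_logical_files(self, dataset):
            """Return the logical file names forming the dataset."""
            return self._catalogue[dataset]["lfns"]

    client = FakeMetadataClient()
    for dataset in client.search_datasets(datatype="AOD", process="ttbar"):
        print(dataset, client.list_logical_files(dataset))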

5.5.4.2 DIAL

The DIAL project [5-24], supported by PPDG in the US, aims at providing a high-level service that simplifies access to the Grid for the user. The user submits a high-level task to the DIAL scheduler, which takes care of splitting the job request and submitting the resulting jobs to the corresponding infrastructure, local batch or Grid. The server also provides additional services such as the merging of result datasets, n-tuples and histograms. DIAL aims to be a comprehensive solution that provides access to Grid resources and to metadata catalogues from an integrated command-line interface in ROOT and Python. A GUI is under development in close collaboration with the GANGA project. Recently the implementation has progressed significantly and essential features are now available; services submitting jobs to local LSF batch queues were put in place at BNL and at CERN in May 2005 and are currently under test.
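
The splitting and merging performed by such a service can be illustrated by the following schematic sketch; the function names and the toy histogram representation (name mapped to a list of bin contents) are invented for the example and do not correspond to the DIAL interfaces:

    # Schematic illustration of job splitting and result merging; the function names
    # and the toy histogram representation are invented, not the DIAL interfaces.

    def split(dataset_files, files_per_job=5):
        """Split the input files of a dataset into file lists for the sub-jobs."""
        return [dataset_files[i:i + files_per_job]
                for i in range(0, len(dataset_files), files_per_job)]

    def merge(histogram_sets):
        """Merge per-sub-job histograms by adding bin contents."""
        merged = {}
        for histograms in histogram_sets:
            for name, bins in histograms.items():
                previous = merged.get(name, [0.0] * len(bins))
                merged[name] = [a + b for a, b in zip(previous, bins)]
        return merged

    # Example: 12 input files split into sub-jobs of at most 5 files each.
    subjobs = split(["file%02d.pool.root" % i for i in range(12)])
    print([len(s) for s in subjobs])                           # -> [5, 5, 2]
    print(merge([{"met": [1.0, 2.0]}, {"met": [0.5, 1.5]}]))   # -> {'met': [1.5, 3.5]}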

5.5.4.3 Production System for Analysis

While it is possible to run analysis-type jobs through the production system, not all of the functionality required to support distributed analysis is currently available. At present users have to run their own instances of the supervisor and executor under their own control. There is also a clear need for development in the area of graphical and command-line interfaces to hide some of the underlying complexity. On the other hand, the production system has been demonstrated to scale to several thousand jobs in parallel.

A related activity is the development of tools to control the production of the ATLAS combined test-beam simulation and reconstruction; these tools have evolved to include job monitoring and verification features. These developments are closely related to the production-system work, and similar features will be available for Grid users.

5.5.4.4 GANGA

GANGA (Gaudi, Athena aNd Grid Alliance) is a joint project between ATLAS and LHCb [5-27], supported by GridPP in the UK. It is a user-centric interface for Athena job assembly, submission, monitoring, and control. GANGA is interfaced to various submission systems such as batch clusters (LSF, PBS), Grid infrastructures (LCG-2, gLite), and experiment-specific production environments (DIRAC of LHCb). A variety of analysis, reconstruction, and simulation applications is supported through an extensible plug-in mechanism. An important aspect is the integration with the development environment, which simplifies the definition of jobs. In addition, GANGA helps users keep track of their jobs and job history by saving input and output files in a systematic and coherent way. The job registry can also be run on a remote server, which enables users to monitor their jobs from various locations (desktops at work, laptops at home, etc.).

The upcoming release of GANGA 4 will be the basis for an extensive user evaluation in the ATLAS community.
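
As an illustration, a GANGA session for submitting an Athena analysis job might look as follows; the exact attribute names of the Athena application and LCG backend plug-ins are release-dependent and are shown here schematically:

    # Illustrative GANGA session, entered at the ganga prompt. The plug-in attribute
    # names are shown schematically and differ between releases.

    j = Job()                                       # new job in the job registry
    j.name = 'aod_analysis_test'
    j.application = Athena()                        # Athena application plug-in
    j.application.option_file = 'MyAnalysis_jobOptions.py'   # assumed attribute name
    j.backend = LCG()                               # LCG-2 Grid; LSF() or PBS() would
                                                    # send the same job to a local cluster
    j.submit()

    jobs                                            # list all jobs in the registry
    j.status                                        # follow the job's life cycle

Switching the backend of the same job object between a local batch system and the Grid, without changing the application configuration, corresponds to the "same look and feel" requirement stated in Section 5.5.1.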

5.5.4.5 DIANE

DIANE is a lightweight distributed framework for parallel scientific applications in a master-worker model [5-25]. It assumes that a job may be split into a number of independent tasks, which is a typical case for ATLAS analysis applications. The execution of a job is fully controlled by the framework which decides when and where the tasks are executed.

DIANE is based on a pull model: workers request tasks from the master. This approach is naturally self-load-balancing. Additional reliability comes from the master, which periodically checks whether workers are still alive in order to detect silent application crashes. DIANE workers act as daemons interacting with the master, which provides a direct way of implementing interactive data analysis.
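
The pull model can be summarized by the following self-contained sketch, with plain Python threads standing in for DIANE workers; this is not the DIANE implementation itself:

    # Self-contained sketch of the pull model; plain Python threads stand in for
    # DIANE workers, and the "analysis" of a task is a toy calculation.
    import queue
    import threading

    def worker(name, task_queue, results):
        """Pull tasks until the master's queue is empty; faster workers automatically
        process more tasks, so the load balances itself."""
        while True:
            try:
                task = task_queue.get_nowait()          # worker asks the master for work
            except queue.Empty:
                return                                  # nothing left: worker terminates
            results.append((name, task, task * task))   # toy "analysis" of the task

    master_queue = queue.Queue()
    for task in range(20):                              # the job split into 20 independent tasks
        master_queue.put(task)

    results = []
    workers = [threading.Thread(target=worker, args=("worker-%d" % i, master_queue, results))
               for i in range(4)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    print("%d tasks processed by %d workers" % (len(results), len(workers)))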

The current prototype of DIANE has proved useful for distributing ATLAS analysis tasks within a local cluster at the Swiss computing centre [5-28]. This functionality is important for the users and will likely continue to be needed in the future. Distributed interactive analysis is simplified by working in the user's account and taking advantage of a shared file system. It would be beneficial to have the local-cluster functionality preserved and augmented. Work on the integration of this framework with Grid middleware is in progress; so far the possibility of running simulation jobs in parallel on distributed LCG-2 resources has been tested successfully. DIANE could complement batch-oriented systems by providing interactive job execution with Grid resources on a large scale.

5.5.5 The ADA Project

ADA, the ATLAS Distributed Analysis system, has as its goal to enable ATLAS physicists to analyse the data produced by the detector and reconstructed by the production system, and to carry out production of moderately large samples. The system must perform well, be easy to use and to access from the expected analysis environments (Python and ROOT), and be flexible enough to adapt to other environments that might later be of interest.

Although ADA is being developed in the ATLAS context, the underlying software packages such as DIAL, GANGA and AMI have little dependence on other ATLAS software. Care is being taken to keep the entire system generic to allow use in other contexts that provide the required data and application wrappers.

The core of ADA currently consists of the DIAL services. Datasets can be imported from the AMI metadata database into the internal DIAL data catalogues so that they are readily and efficiently usable in the DIAL context. DIAL services can split jobs and submit them to the local LSF batch systems at BNL and CERN; DIAL will also keep track of the jobs, collect the outputs, and make them available to interactive user sessions.

ADA will also provide its users with a graphical interface to aid in examining data and specifying, submitting and monitoring jobs. This work is being done by the GANGA project, making use of the Python binding to DIAL.

ADA had its first public release in April 2005. There are two analysis service instances running at BNL (one for long-running jobs and one that provides interactive response) and one at CERN. ADA has clients to enable use of the AMI service but so far little of its data is resident there. All of the reconstructed data from the first ATLAS data challenge and part of the datasets produced since are available as ADA datasets; work is in progress to complete the catalogues with all existing ATLAS data. Transformations have been defined to allow users to access the data and insert their own analysis algorithms.

ROOT and Python clients are available, providing the full DIAL functionality described earlier. Together with the analysis services above, plus the AOD analysis and the combined n-tuple transformation, they constitute the bulk of the functionality available in the first ADA release.

As of June 2005, we are in the process of re-evaluating the experience gained so far with the development of the DIAL-based ADA prototype. The several related projects described above have to be brought together in a coherent way, so that end-users will not have to change their work model or environment depending on whether they want to run on a local machine, a local interactive or batch cluster, or the Grid.

A large amount of simulated data, produced for the June 2005 Rome Physics Workshop, and of real data, produced by the 2004 combined test beam, is distributed around the world and is now potentially accessible using Grid tools. These datasets will be used during 2005 to test the different options outlined above, and to define the baseline solution to be developed further.


