WLCG Workload Management TEG Topical comments

Modified on: 2011-10-31

Pilots and frameworks

Claudio Grandi

Given the current working model of the infrastructure and the current needs of the experiments, pilot jobs are needed, but we should be aware of the limitations and concerns, especially in the area of security, that working with pilots creates. Pilot jobs implement the important split between resource allocation and job management, but the first part is fragile. We should address this for the short-to-medium term and understand how the model can evolve in the longer term to accommodate both "private job management" and the classical job submission that many other communities using the same infrastructure will need.

Oxana Smirnova

Without any doubt, they will be heavily re-designed to provide application environment virtualisation, to use the hardware in the optimal manner, and to satisfy security requirements.

Convergence to a single framework for LHC experiments on short-term scale will be a massive investment of human power, but in the long run will save efforts, both in maintenance and deployment.

Ricardo Graciani

I'm ready to defend that they provide an extra layer of homogenization, isolation and usage control for large user communities (like those we are dealing with) that we will very unlikely ever get from a common generic middleware.

Marco Cecchi

They are the result of a good idea; nevertheless, the ability of big experiments to (re)build their own frameworks from the ground up with their own people must be taken into account. AFAIK each experiment has a different interpretation, model and software for this concept.

They are sort of a shortcut to what has been designed over all these years to serve as a common framework for grid/distributed computing.

Sometimes, actually most of the time, this happened for a good reason. The fact that they are the trend in HEP nowadays does not, of course, mean that the story of workload management has ended. They also have significant drawbacks that rarely seem to be taken into account by HEP. Of course this will continue as long as the advantages outnumber the disadvantages, i.e. the free lunch, not caring about security, outbound connectivity etc. That's fair enough, of course.

Torre Wenaus

I've been a big fan of pilot based systems ever since I learned (in 2005) what LHCb were up to with DIRAC. For all the usual reasons. ATLAS experience with PanDA has been very good. The independent developments of pilot based systems in the different experiments happened for various good reasons but I expect there is scope for some convergence. At the pilot level itself glideinWMS could be a point of convergence.

Resource allocation and job management

Requirements for a CE service, evaluation of the needs for a (global?) WMS, requirements for public/private access to WNs (incoming/outgoing connectivity)

Claudio Grandi

Even in a new scenario where resource allocation is completely split from job execution, we will need an interface to the site that is under the site's control. Call it a CE or whatever, but it will be needed because it would not be sustainable to have a "gateway"-like service running on a machine at the boundary of the site under user responsibility instead of administrator responsibility. An interface is also needed to provide (on demand) dynamic information on site status.

Global (public) WMSs are services for those who want to use them. The model should not impose them but should be compatible with them. In the longer term they may be needed to facilitate resource allocation, even though for current LHC VOs they may not be needed given the limited number of sites (~50-100 for big VOs) and the existence of clear pledges of sites to each VO.

Oxana Smirnova

This assumes the CE-WN model and centralised scheduling, which is not always the case even now, and probably will not be the case in future.

A safe assumption is that we don't know where and under which conditions the processing will be made, thus models must be independent from such details as availability of inbound connectivity.

Marco Cecchi

The need for a WMS comes essentially from the effectiveness of match-making in selecting dedicated resources, in terms of presence of required data, duration of the slot, processing power etc., as opposed to taking whatever comes from the wild and then deciding whether to keep it, discard it, or keep it busy doing nothing. The WMS as it is now is not meant for opportunistic scheduling, and we are interested in a possible evolution in this direction.

Despite all the claims by big VOs that they would dismiss the WMS in a short time, as far as I can see it is still there. I'd like to know why in the first place, because I just don't know. As far as I can see, some VOs are still interested in selecting resources in advance through an appropriate match-making process able to get slots according to a given 'work' (= processing power over a given time).

Besides, a WMS should be there to provide added value. In the current scenario, HEP doesn't seem to be much interested in that. I.e.:

Resubmission -> not needed with pilots, deals badly with sites that do job migration

Complex jobs, MPI jobs: -> not particularly appealing to the mainstream HEP community

Rich, high-level JDL, configuration tips, parameter passing through the whole chain: something big frameworks are not interested in either.

Of course many of these points require a full-featured connection with the LRMS and a grid-enabled submission chain, i.e. what CREAM+BLAH cover now.
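
As an illustration of the match-making idea in this comment, here is a minimal Python sketch. It is not the gLite WMS algorithm: the Slot and Job types, the field names and the "least wasted capacity" ranking are all assumptions made for the example. It only shows selecting slots by presence of the required data and by 'work' (processing power over a given time).

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical description of a resource slot offered by a site."""
    site: str
    datasets: frozenset  # datasets available at the site
    hours: float         # duration of the slot
    power: float         # processing power, in arbitrary units


@dataclass
class Job:
    """Hypothetical job request."""
    dataset: str  # data the job needs to find at the site
    work: float   # processing power x time, in the same arbitrary units


def match(job, slots):
    """Return the slots able to complete the job, best fit first."""
    candidates = [
        s for s in slots
        if job.dataset in s.datasets and s.power * s.hours >= job.work
    ]
    # Assumed ranking: prefer the slot that wastes the least capacity.
    return sorted(candidates, key=lambda s: s.power * s.hours - job.work)


slots = [
    Slot("site-a", frozenset({"dsA"}), hours=24, power=10),
    Slot("site-b", frozenset({"dsA", "dsB"}), hours=12, power=10),
    Slot("site-c", frozenset({"dsB"}), hours=48, power=10),
]
job = Job(dataset="dsA", work=100)
ranked = [s.site for s in match(job, slots)]
print(ranked)
```

Here site-c is excluded because it does not hold the data, and site-b ranks ahead of site-a because its slot is closer in size to the requested work.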

Torre Wenaus

For PanDA, what we ask of a site in terms of CE is CondorG support. Or bare Condor. This has worked extremely well.

Global workload management is necessary but experience suggests we should not look to common middleware for it. ATLAS experience has been that having it integrated into the experiment workload management system has many advantages.

On WN connectivity, outbound http(s) is ATLAS' (only) requirement. We have not had problems getting that supported, and it is friendly to the security interests of sites (it can of course be proxied and often is).

Use of information services

Claudio Grandi

Needed to describe the topology (e.g. CE-SE association, resource size, ...) but not for scheduling purposes. For that, each WMS needs to find its own way of extracting information from the site gateways.

Oxana Smirnova

Let's hope that we will see the day when the distributed computing resources will all be adequately described and this information will be easily and consistently available. This will make a Grid out of our resources, finally.

Perhaps the simplest case is when every resource and service gets fully virtualised, and the amount of information to be published is absolutely minimal, like one URL for the entire Grid infrastructure (much like e.g. www.dropbox.com is today).

Di Qing

If it's not easy to describe the resources adequately, how about publishing just minimal information? In this case the pilot job itself finds out what resources are available on the execution node and pulls jobs which can run there. However, not all projects use the pilot model.
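
The pull model described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory queue stands in for a VO's central task queue, and the job fields (cores, disk_gb) and the function names are hypothetical, not taken from any existing pilot framework.

```python
import os
import shutil


def describe_node(scratch_dir="."):
    """Collect a few facts about the node the pilot has landed on."""
    return {
        "cores": os.cpu_count() or 1,
        "free_disk_gb": shutil.disk_usage(scratch_dir).free / 1e9,
    }


def pull_job(queue, node):
    """Remove and return the first queued job that fits this node, or None."""
    for i, job in enumerate(queue):
        fits_cpu = job["cores"] <= node["cores"]
        fits_disk = job["disk_gb"] <= node["free_disk_gb"]
        if fits_cpu and fits_disk:
            return queue.pop(i)
    return None


# Hypothetical queue: the first job asks for far more cores than any node has.
queue = [
    {"id": 1, "cores": 10**6, "disk_gb": 1},
    {"id": 2, "cores": 1, "disk_gb": 0},
]
node = describe_node()
job = pull_job(queue, node)
print(job)
```

The site publishes nothing about the node in advance; the pilot measures it on arrival and skips work that would not fit, leaving that work in the queue for a pilot that lands on a larger node.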

Marco Cecchi

This goes along with the downsides of push-mode scheduling. Retrieval of dynamic information, in particular, must be done in a different way. But is that something that really must be done afresh each time a pilot lands? How many of these bits of information, both static and dynamic, are really retrievable in a more or less simple way by a pilot?

Torre Wenaus

Plenty of scope here for more commonality. ATLAS has made some efforts in that direction with AGIS, ATLAS Grid Information System, which is mainly an aggregator of other information sources. Ad hoc info sources, either primary or cache, are easy to create, and ATLAS has many. Efforts are underway to consolidate at the AGIS level, but potential is there to consolidate at a higher level at least for many things. Bureaucratic overhead has to be kept to a minimum, extensibility and flexibility must be well supported.
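
To make the "aggregator of other information sources" idea concrete, here is a minimal Python sketch. It does not reflect how AGIS is implemented: the source names, the site attributes and the rule that earlier sources win on conflict are all assumptions chosen for the example.

```python
def aggregate(sources):
    """Merge per-site records; earlier sources take precedence on conflicts.

    `sources` is a list of (source_name, {site: {attribute: value}}) pairs.
    Each merged attribute is stored as (value, source_name) so that the
    origin of every piece of information stays visible.
    """
    merged = {}
    for name, records in sources:
        for site, attrs in records.items():
            entry = merged.setdefault(site, {})
            for key, value in attrs.items():
                # Keep the first value seen and remember where it came from.
                entry.setdefault(key, (value, name))
    return merged


# Hypothetical sources: an experiment-level override and site-published data.
sources = [
    ("vo_overrides", {"SITE_X": {"status": "offline"}}),
    ("site_published", {
        "SITE_X": {"status": "online", "cores": 800},
        "SITE_Y": {"status": "online"},
    }),
]
view = aggregate(sources)
print(view["SITE_X"])
```

Adding a new ad hoc source is then just one more entry in the list, which is one way to keep extensibility high and bureaucratic overhead low.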

Stuart Wakefield

It strikes me that the combination of push-based job submission from the WMS and its use of the information system to guide this is a source of problems, and a not small part of the reason for experiments to write their own pilot job frameworks. The information system is quite complex, and as more resources not dedicated to HEP are added this is hardly likely to be simplified. Thus relying on it as the primary aid for the WMS to make global scheduling decisions seems like it will always disappoint VOs in its scheduling results to a certain degree. If we could move to a pull-based system (or add one alongside the current model), then we skip the need for the information system to attempt to present a uniform view of a complex and disparate system.

Security models

e.g. MUPJs, authentication and authorization - in collaboration with the Security TEG

Claudio Grandi

See the comments on MUPJs above. We need to define a responsibility model that is acceptable for sites already for the short term. In general I think we are underutilizing the possibilities offered by attributes for authZ.

Igor Sfiligoi

We have been talking about MUPJ for a long time, but the progress has been very slow.

I want to understand why and how to move forward.

I also want to discuss if the current tools (i.e. glexec) are the right thing.

Oxana Smirnova

I would like to make sure we don't develop our own custom security solutions again. We have to realise that we use resources that we don't own, and therefore cannot come with our own rules - we always have to comply with local policies if we want to get the service.

Security is there to protect the resource and also to protect users. LHC researchers should neither expose themselves to attacks nor create backdoors that expose the resources they use. Sounds trivial, doesn't it?

Existing open source solutions with many users are typically more secure, because they are scrutinised much more often than proprietary in-house hacks.

Torre Wenaus

The workload management systems themselves could play an important role in the security model. They hold the most information on user identifiability and activity. Standard info services/APIs to the WMSs could be used by sites and service providers as an effective means of obtaining rich information for monitoring, diagnostics and acting on security incidents. We have done this with PanDA and it's proven useful. It avoids some of the complications of relying entirely on the middleware layer.

New computing models

Claudio Grandi

Cloud computing may offer interesting concepts for building a model for resource allocation that solves the security concerns on MUPJs.

Depending on the resource allocation model "whole node scheduling" may be an aspect that simplifies the view, but it is not necessarily bound to it. I'd leave this, together with virtualization, parallelization, etc... to a later stage of the discussion.

Igor Sfiligoi

I see this coming along, and we may need to adopt them whether we want to or not.

The nice uniformity of the past few years is likely to go away, and we should make sure that our tools can handle it.

An interesting question is: Should we hide it from the user applications? If yes, how and how much?

PS: We already do something in glideinWMS, but it is quite limited for now.

Ulrich Schwickerath

On virtualization, I think it makes sense to acknowledge the activity of HEPiX in this respect, where a lot of work has been done specifically on policies.

Oxana Smirnova

Nothing really new here - if LHC communities can not provide portable application software, they will need to make full use of virtualised application environments.

But if application developers seriously decide to invest effort in re-implementing everything for e.g. GPUs, they should use this opportunity to develop actually portable software. This will reduce dependency on virtualisation technologies.

Di Qing

The same point as others. Most sites support multiple experiments or projects, and those projects always have different requirements, so virtualization seems a reasonable solution. Some of our Tier-2s created different OS images for different projects. When some nodes are allocated to one project, those nodes are booted with the OS image for that project. This can be a solution too if it can be done automatically.

Ricardo Graciani

I think that the evolution of the hardware forces us towards the design of new "application" frameworks that allow use of many-core systems in a much more efficient way than the current n-single-thread approach that is dominant today.

Torre Wenaus

Exploiting virtualization and ensuring we can effectively leverage cloud resources are important, as is being able to utilize multi-core efficiently.

Background materials

glideinWMS: http://tinyurl.com/glideinWMS (Igor Sfiligoi)
gLite WMS: http://web.infn.it/gLiteWMS/ and http://wiki.italiangrid.org/WMS (Marco Cecchi, Massimo Sgravatto)
CREAM computing element http://wiki.italiangrid.org/CREAM (Massimo Sgravatto)
CEMon service http://wiki.italiangrid.org/CEMon (Massimo Sgravatto)
PanDA: https://twiki.cern.ch/twiki/bin/viewauth/Atlas/PanDA (Torre Wenaus)

-- TorreWenaus - 25-Oct-2011

Topic revision: r2 - 2011-10-31 - TorreWenaus
