Rollout of handling job priorities, VOViews, etc
Introduction
Notations
- VO = Virtual Organisation
- ACBR = (Glue)AccessControlBaseRule, an attribute in the GlueSchema that describes the accessibility of a given object (CE, SE, etc.)
- VOView = VOView blocks are elements of the GlueSchema that describe access rights and other properties of an object from the point of view of a specific group (a VO, a group inside a VO, or individuals)
- Now = 19 Jun 2007
Origin of the problem
- Sites participating in EGEE/LCG support different VOs with different shares and different priorities. What's more,
- members of one VO with different group memberships or roles are also supported and handled differently, thus
- all this information should be available in the infosys during matchmaking so that the job is sent to the most appropriate place. This is to be enabled by the VOView blocks.
Present situation
Timescales for the different actions are counted from Now, i.e. from 19 Jun 2007.
As of 19 jun, 2007
- The lcg-RB ignores VOViews; it uses only the ACBR information of the GlueCEUniqueID block.
- The gLite WMS interprets (or will interpret) the VOView blocks and takes their content into account during matchmaking.
- The infosys now contains VOView blocks for all the VO names and VOMS FQANs defined in the ACBR fields of the GlueCEUniqueID,
- which, due to the lack of agreement on how to interpret the inclusiveness or exclusiveness of VOViews and ACBRs,
- results in an inconsistent interpretation of their content and occasionally in erroneous matchmaking by the WMS.
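The ambiguity described above can be illustrated with a minimal sketch. The dictionary layout, the CE name and the helper functions are hypothetical and only model the idea; they are not the actual lcg-RB or WMS matchmaking code. The sketch assumes that a job carries both its VO name and its VOMS FQAN as candidate rules.

```python
# Hypothetical in-memory model of one CE; field names are illustrative only.
ce = {
    "id": "ce.example.org:2119/jobmanager-pbs-atlas",
    "acbr": ["VO:atlas", "VOMS:/atlas/Role=production"],
    "views": [
        {"name": "atlas", "acbr": ["VO:atlas"]},
        {"name": "atlas-prod", "acbr": ["VOMS:/atlas/Role=production"]},
    ],
}


def rules_for(vo, fqan=None):
    """Build the candidate ACBR values for a job (assumed convention)."""
    rules = ["VO:" + vo]
    if fqan:
        rules.append("VOMS:" + fqan)
    return rules


def ce_accepts(ce, rules):
    """ACBR-only check: look at the CE-level ACBR values and ignore the views."""
    return any(rule in ce["acbr"] for rule in rules)


def matching_views(ce, rules):
    """View-aware check: names of all views whose ACBR matches one of the rules."""
    return [view["name"] for view in ce["views"]
            if any(rule in view["acbr"] for rule in rules)]


plain_member = rules_for("atlas")
production = rules_for("atlas", "/atlas/Role=production")
```

With this model, `ce_accepts` answers yes for both jobs, while `matching_views(ce, production)` returns two views. Without an agreed rule on whether views are inclusive or exclusive, it is not defined which of the two views should drive the ranking.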
Very short term action
Time scale
Plan
- Rejecting previous versions of YAIM which have an 'erroneous' VOView configuration.
- Releasing a new version of YAIM (3.0.1-22) which does not configure VOView blocks that refer to VOMS FQANs; only VOViews for VO names will be configured.
Problems to solve
Status
- The status and content of this version of YAIM is available on the YAIM Planning page
- This very short term action has been associated with YAIM version 3.0.1-22, patch #1200
Short term plan
Time scale 2
Plan 2
- Reimplementing configuration for VOViews for VOMS FQANs.
- Implementing a feature which enables/ensures the possibility of correct interpretation of these VOView blocks.
Problems to solve 2
- the lcg-info-dynamic-scheduler has to be fixed to allow ACBR attributes to appear in any order within a VOView and/or GlueCEUniqueID block. The LDAP mapping of the Glue Schema does not specify any order, so the dynamic scheduler should not enforce a specific one. See status here, bug #27517
- the WMS has to interpret the VOView blocks
- there should be an agreement on the syntax of the ACBR attribute
- For the moment the VO: and VOMS: prefixes have to be preserved for backward compatibility with the lcg-RB.
- there should be an agreement on the meaning of VOMS FQANs
- the VOMS FQANs in the ACBRs have to be interpreted in the same way as they are interpreted in the grid-mapfiles. Info here, status here, bug #27545
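As an illustration of the first and third problems above, the sketch below reads ACBR values from an LDIF-like text block into a set, so that the order of the attributes cannot matter, and splits each value into its `VO:` or `VOMS:` prefix and the remainder. It assumes the attribute is published as `GlueCEAccessControlBaseRule`; the function names and the sample blocks are hypothetical, and this is not the code of the lcg-info-dynamic-scheduler.

```python
from typing import FrozenSet, NamedTuple


class Acbr(NamedTuple):
    scheme: str  # "VO" or "VOMS"
    value: str   # VO name or VOMS FQAN


ACBR_ATTRIBUTE = "GlueCEAccessControlBaseRule:"  # assumed attribute name


def parse_acbr(raw: str) -> Acbr:
    """Split 'VO:atlas' or 'VOMS:/atlas/Role=production' at the first colon."""
    scheme, sep, value = raw.strip().partition(":")
    if not sep or scheme not in ("VO", "VOMS") or not value:
        raise ValueError("unsupported ACBR value: %r" % raw)
    return Acbr(scheme, value)


def acbr_set(ldif_block: str) -> FrozenSet[Acbr]:
    """Collect all ACBR values of a block, independent of their order."""
    rules = set()
    for line in ldif_block.splitlines():
        line = line.strip()
        if line.startswith(ACBR_ATTRIBUTE):
            rules.add(parse_acbr(line[len(ACBR_ATTRIBUTE):]))
    return frozenset(rules)


block_a = """GlueCEAccessControlBaseRule: VO:atlas
GlueCEAccessControlBaseRule: VOMS:/atlas/Role=production
"""
block_b = """GlueCEAccessControlBaseRule: VOMS:/atlas/Role=production
GlueCEAccessControlBaseRule: VO:atlas
"""
same_rules = acbr_set(block_a) == acbr_set(block_b)
```

Comparing sets rather than lists is the design choice that removes the dependence on attribute order: here `same_rules` is true although the two blocks list their ACBRs in opposite order.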
Questions to clarify 2
- Does the WMS interpret both old-style and new-style VOMS FQANs?
- Which wildcards are interpreted in a VOMS FQAN, and how?
- Clarify the inclusiveness of ACBRs
- Clarify the group hierarchy of VOMS attributes
Proposals 2
- Problem #3 There should be no syntactical difference between VO names and VOMS FQANs, and only new style VOMS FQANs should appear in the infosys.
Miscellaneous 2
- The name of the VOView doesn't have to be the same as the value of its ACBR; the only requirement is that it has to be unique inside a GlueCEUniqueID.
- On SLC4, because newer versions of OpenLDAP enforce stricter schema checking, '=' signs cannot be part of a dn.
Status 2
- This short term action has been associated with bug #25693; you can check its status there.
Comments 2
- Added by Stephen Burke:
- On problem 5, I think it's essential to have an interpretation consistent with the LCMAPS mapping, not just preferable - if we don't have that we may as well not use the VOViews at all! Of course that doesn't mean that every detail has to be represented, e.g. the scheduler may have some level of fairshares at the individual user level but in general it won't be feasible to publish at that granularity.
- On question 3 about inclusiveness, my view is that published views should be exclusive. Specifically, each view should correspond to some defined share in the LRMS (typically mapped to a unix group, but not necessarily). Since a given job will be in one and only one scheduling class, those views will be exclusive by construction.
- There is also a question of completeness, i.e. whether every scheduling class should have a view. I am inclined to think that it would be a good idea for the views to be complete but I don't think it's absolutely essential, as long as the WMS will fall back to the generic CE information if there is no matching view.
- On the Proposal and more generally: I would like to remind people that the ACBR syntax is part of the GLUE schema, and the purpose of that is to allow different Grids to interoperate. We should therefore not introduce a syntax which is specific to LCG/EGEE and cannot be interoperable. The reason to have the VO: and VOMS: prefixes is precisely for that: in general not all Grids will be using VOMS, but the concept of a VO is probably generic. Other authorisation systems could then have different prefixes. I therefore have no problem with a proposal that LCG/EGEE makes a decision to only use FQANs in ACBRs, but I think it's essential that the VOMS: prefix is still used.
- This also applies to the proposed DENY syntax: the suggestion was to use something of the form DENY:FQAN, e.g. DENY:/atlas/Role=Production. However, this is not interoperable with grids using other authz systems because the DENY applies only to FQANs. I would like a syntax which makes it explicit what is being denied. I would be inclined to suggest something like DENY:VOMS:FQAN, although I believe the current syntax assumed by the WMS and info provider only allows for one colon. At any rate the syntax should be defined precisely and not left to individual interpretation.
- The GLUE working group (specifically Laurence Field) is currently collecting use cases to inform a new version of the schema, and I think it would be good if this case could be included there.
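To make the DENY discussion in the comments above concrete, here is a minimal sketch of a parser for the suggested `DENY:VOMS:FQAN` form. It is only an illustration of what a precisely defined syntax could look like; the `Rule` type, `parse_rule` and the list of accepted schemes are hypothetical and do not describe what the WMS or the info provider currently implement.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    deny: bool   # True if the rule started with DENY:
    scheme: str  # authorisation scheme prefix, e.g. "VO" or "VOMS"
    value: str   # VO name or FQAN


KNOWN_SCHEMES = ("VO", "VOMS")  # other authz systems could add their own prefix


def parse_rule(raw):
    """Parse '[DENY:]SCHEME:value' and insist on an explicit scheme."""
    text = raw.strip()
    deny = text.startswith("DENY:")
    if deny:
        text = text[len("DENY:"):]
    # Split at the first colon only; the FQAN itself contains no colon here.
    scheme, sep, value = text.partition(":")
    if not sep or scheme not in KNOWN_SCHEMES or not value:
        raise ValueError("rule needs an explicit scheme prefix: %r" % raw)
    return Rule(deny, scheme, value)


denied = parse_rule("DENY:VOMS:/atlas/Role=Production")
allowed = parse_rule("VO:atlas")
```

In this sketch the one-colon form `DENY:/atlas/Role=Production` is rejected with a `ValueError`, because it does not say which authorisation system the denied value belongs to.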
Long term solution
Time scale 3
- As soon as there is a good idea
Plan 3
- Find a cleverer way of publishing VO/group-specific scheduling information.
Proposals 3
- The concept of a queue should disappear from the information system.
Links
-- Main.gdebrecz - 19 Jun 2007