Summary of ARGUS Future Workshop, December 11, 2014 (CERN)

Agenda

https://indico.cern.ch/event/348018/

Introduction - D. Groep

Open meeting, semi-structured agenda

  • Why are participants here: painting the Argus/Authz landscape
  • What do we see Argus doing to address the issues?
  • Who is ready to take part in whatever effort is decided, and has the technical skills to do so?

Why attending this workshop?

R. Wartel

  • WLCG wants a central suspension solution
  • WLCG would like to see at least interoperability if not a common solution on both sides of the Atlantic

D. Van Dok

  • In charge of operations at NIKHEF; finds it critical to have a tool to control who is using the site
  • Does not care that much about ARGUS itself, but some such tool is needed

Misha

  • Developer of glexec and lcmaps, user of ARGUS
  • Also developed an ARGUS backend service, ARGUS-EES
  • Worked on central suspension with ARGUS
  • Concerned about interoperability between both sides of the Atlantic

Manuel G

  • In charge of running ARGUS services, critical to have support

M. Jouvin

  • GDB chairman, at the origin of this workshop
  • Idea is that, to solve the support problem, we first need to see what the long term interest in ARGUS is, and whether a community can be built around it if it is our technical solution for our requirements
  • Need to revisit our response to authz requirements as the landscape evolved/is evolving, in particular with federated identity
  • Fine grained authz and banning is a required feature for a large scale distributed infrastructure

J. White

  • Involved in ARGUS history during EMI
  • An ARGUS fan! Did some work to use it in a cloud/OpenNebula context in a Finnish project (but now leaving)
    • Developed an OpenStack plugin for ARGUS-EES

I. Collier

  • Runs the team in charge of grid services; hears the complaints of the people involved in its management about the complexity of the service
  • Would need an improved capability for users remote from RAL to administer the central suspension service

D. Kelsey

  • Involved in several security coordination activities
  • Fan of ARGUS but not really concerned about the exact tool: the features are what is important, in particular the central suspension
  • Importance of fine grained control and traceability
  • Should mention I. Bird's question: can we move to a common implementation of the service we need on both sides of the Atlantic?

V. Brillault

  • Main concern is central suspension capability

A. Sciaba

  • WLCG Ops Coord coordinator
  • Concern is long term support of ARGUS

U. Tigerstedt

  • ARC support for ARGUS is not good, and it is difficult to fix without proper support from ARGUS
  • Performance issues with big clusters

Cristina

  • UMD release manager
  • Central suspension is a critical feature for EGI
  • Would like to see ARGUS able to manage non-X509 credentials

A. Ceccanti

  • Involved for a long time in security developments, in particular VOMS and ARGUS PAP (PAPd + pap-admin)
  • At CNAF, involved in PEP backend development
  • An ARGUS fan!
  • Thinks that performance issues/problems can be solved: no work has been done on this so far
  • Managing the security support unit in GGUS and well aware of standing issues: need to find a solution to handle them, INFN can play a role but not alone

Maarten

  • Involved in the early design of the service... and in fact responsible for the name!
  • Reviewed Valery Tschopp's last presentation, at the July 2012 GDB, and thinks it is a good starting point for identifying the challenges and why ARGUS is a required service. Critical features:
    • A policy editor easy enough to use and allowing for hierarchical policies
    • Designed for scalable deployments and high availability: not clear that current performance problems seen at some sites are linked to bad design (may be just bugs!)
  • ARGUS sustainability is important: sites will pay the price if it disappears

B. Bockelman

  • OSG relies on GUMS for a service similar to ARGUS
  • GUMS has not been maintained for 3 years!
    • OSG has now taken over the maintenance

P. Solagna

  • ARGUS proved to be a good base for an emergency suspension service
  • Confidence in long term support is important to get the service widely deployed

D. Groep

  • ARGUS has a very good abstraction of an authorization framework
  • Need to integrate with frameworks developed in the federated identity world and with the low level of identity vetting of IOTA
    • ARGUS is addressing authz from the service provider side, whereas new frameworks are concentrating on the IdP side: may be complementary
    • Remains a pretty unique solution
  • Would be a major step backward to move back to grid mapfiles
  • Central suspension is a major, required feature
  • Already have a lot of experience with ARGUS; probably not a huge amount of work to make improvements that would make a lot of people happier!

Proposed topics for discussion

  • Short presentation of ARGUS and GUMS architectures
  • New challenge with identity federation
  • Operational experience
  • Support issues and long term support

ARGUS and GUMS Architecture

GUMS - B. Bockelman

A site service made of a Tomcat app with (SOAP) interfaces in front of a database

  • Administrative (web)
  • CLI
  • LCMAPS callout to GUMS using the lcmaps-plugin-scas-client
    • The same interface is used by a Java-based client used by BestMAN for authz through a callout to GUMS
  • Database: cache for VOMS information + local (non VOMS) information
    • In the future, VOMS checking will be done directly in the GUMS/LCMAPS client without going to GUMS, but this is not yet the default config
    • Every site with a GUMS server has a local copy of the VOMS data for the supported VOs

No generic policy engine, but rather a set (chain) of hardcoded matching lines

  • Mapping rules defined by assigning users based on some pattern matching to a GUMS group and calling an account mapper to assign an actual account
    • Several types of mappings available, including one account shared by a group or manual mapping that uses pre-existing accounts
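
As an illustration only, the chain of matching lines described above can be sketched in Python. This is not GUMS code or GUMS configuration: the class and function names, the use of regular expressions on the DN, and the "first matching rule wins" behaviour are all assumptions made for the sketch.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class MappingRule:
    pattern: str                          # regex tested against the user DN (assumed format)
    group: str                            # group the matching user is assigned to
    account_mapper: Callable[[str], str]  # turns a DN into a local account name


def shared_account(account: str) -> Callable[[str], str]:
    """Account mapper: every member of the group shares one account."""
    return lambda dn: account


def manual_mapping(table: Dict[str, str]) -> Callable[[str], str]:
    """Account mapper: look the DN up in a table of pre-existing accounts."""
    return lambda dn: table[dn]


def map_user(dn: str, rules: List[MappingRule]) -> Optional[Tuple[str, str]]:
    """Walk the chain in order; the first matching rule wins (assumption)."""
    for rule in rules:
        if re.search(rule.pattern, dn):
            return rule.group, rule.account_mapper(dn)
    return None  # no rule matched: the user is not mapped


# Hypothetical chain: one manually mapped administrator, then a shared VO account.
RULES = [
    MappingRule(r"CN=Alice Admin$", "admins",
                manual_mapping({"/DC=org/DC=example/CN=Alice Admin": "alice"})),
    MappingRule(r"^/DC=org/DC=example/", "example-vo", shared_account("exvo001")),
]
```

With these hypothetical rules, `map_user("/DC=org/DC=example/CN=Bob", RULES)` falls through the first line and is mapped by the second to the shared account.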

A banning list is checked first and can retrieve information from 2 external sources in addition to local entries

  • VOMS banning list
  • ARGUS banning list (through SOAP)
  • Updated every 20 min
  • No GUMS server hierarchy: each site has to configure the appropriate external sources
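
A minimal sketch of a ban list that is checked first and periodically refreshed from external sources, assuming each source can be wrapped as a callable returning banned DNs. The class name, the fetch callables and the lazy refresh strategy are hypothetical; the real VOMS and ARGUS (SOAP) lookups are not shown.

```python
import time
from typing import Callable, Iterable, List, Optional, Set


class BanList:
    """Local ban entries plus entries pulled from external sources."""

    def __init__(self,
                 local_entries: Iterable[str],
                 sources: List[Callable[[], Iterable[str]]],
                 refresh_seconds: float = 20 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.local: Set[str] = set(local_entries)
        self.sources = sources              # e.g. one callable per external ban list
        self.refresh_seconds = refresh_seconds
        self.clock = clock                  # injectable so the refresh can be tested
        self._external: Set[str] = set()
        self._last_refresh: Optional[float] = None

    def _refresh_if_due(self) -> None:
        now = self.clock()
        due = (self._last_refresh is None
               or now - self._last_refresh >= self.refresh_seconds)
        if due:
            merged: Set[str] = set()
            for fetch in self.sources:
                merged.update(fetch())
            self._external = merged
            self._last_refresh = now

    def is_banned(self, dn: str) -> bool:
        """To be called before any mapping rule is evaluated."""
        self._refresh_if_due()
        return dn in self.local or dn in self._external
```

The sketch refreshes lazily on lookup rather than from a background timer; either approach gives the same answer as long as lookups are frequent compared with the refresh period.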

Demonstrated mapping rates of a couple of hundred Hz

Can be run in HA mode.

Each site has to maintain its own configuration, but a feature allows merging changes from a reference configuration

  • Does not prevent site config deviation... but helps to maintain a minimum of consistency

OSG view on central suspension is that it should be a capability implemented by the VO and that, if a VO doesn't have/implement this capability, it risks a VO suspension in case of a problem.

ARGUS - M. Litmaath / A. Ceccanti

See V. Tschopp's presentation at July 2012 GDB

Authz question: can user X perform action Y on resource Z?

Motivation for ARGUS: have a service common to all services to implement authz in a consistent way instead of the existing jungle

  • Generic authz system built on top of a XACML policy engine
  • Running at scale is at the heart of the design: this is the reason for the PEP, a client component on the server side that allows a very lightweight protocol between clients and server
    • Transforms lightweight requests into XACML
    • PEP applies some filters and then passes the request to the PDP for the actual authz decision
    • Caches responses (but a bug prevents its usage)
    • As few dependencies as possible to reduce maintenance troubles with unmaintained dependencies
  • Policy administration: PAP. For humans to administer XACML policies
    • A PAP can provide policies to other PAPs: allows defining a hierarchy
    • Final decision rests with sites
    • Performance not critical
  • Policy Decision Point (PDP): one of the critical components for performance
    • Main access to PDP is through PEP, but a direct XACML API is also available
    • Concerns that led to the use of the PEP are probably no longer a major problem (in particular bandwidth)

ARGUS policies can give 2 answers: PERMIT and DENY

  • As a central service, could provide an alternative to a gridmapdir
  • Policies expressed in SPL (Simplified Policy Language) to hide XACML complexity
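
The flow described in this section (a lightweight request handled by the PEP, which caches answers and asks the PDP for a PERMIT or DENY decision) can be sketched as follows. This is an illustration under stated assumptions, not the ARGUS API or the SPL syntax: the `Rule`, `PDP` and `PEP` classes, the `*` wildcard, the first-match evaluation order and the default DENY are all hypothetical choices made for the sketch.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

PERMIT, DENY = "PERMIT", "DENY"


@dataclass(frozen=True)
class Request:
    subject: str    # user X
    action: str     # action Y
    resource: str   # resource Z


@dataclass(frozen=True)
class Rule:
    effect: str            # PERMIT or DENY
    subject: str = "*"     # "*" matches anything (assumed convention)
    action: str = "*"
    resource: str = "*"

    def matches(self, req: Request) -> bool:
        pairs = ((self.subject, req.subject),
                 (self.action, req.action),
                 (self.resource, req.resource))
        return all(want in ("*", got) for want, got in pairs)


class PDP:
    """Decision point: the first matching rule gives the answer (assumption)."""

    def __init__(self, rules: Iterable[Rule]) -> None:
        self.rules = list(rules)

    def decide(self, req: Request) -> str:
        for rule in self.rules:
            if rule.matches(req):
                return rule.effect
        return DENY  # default when nothing matches (assumption)


class PEP:
    """Enforcement point: builds the full request and caches PDP answers."""

    def __init__(self, pdp: PDP) -> None:
        self.pdp = pdp
        self.cache: Dict[Request, str] = {}

    def authorize(self, subject: str, action: str, resource: str) -> str:
        req = Request(subject, action, resource)
        if req not in self.cache:
            self.cache[req] = self.pdp.decide(req)
        return self.cache[req]
```

Putting a DENY rule for a banned subject ahead of the PERMIT rules is how this sketch would express banning; a real deployment would also need cache expiry so that a new ban takes effect promptly.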

Management interface: pap-admin

  • Command line tool to define policies and ban users
  • No web interface

Implementation language: Java for everything except the PEP C library

ARGUS architecture is not bound to X509.

  • SAML-ready

Federated Identity Challenges

Moving to a world where there are multiple sources of user attributes that have to be composed to take authz decisions

  • Main technology players: SAML and OpenID
  • Bags of attributes: close to VOMS SAML...
  • Currently the FedID world tends to do authz through authentication

ARGUS has some strong selling points

  • Already has SAML support (built on OpenSAML)
  • Building blocks for trust composition already exist
  • PIP receiving push and pull attributes
  • If ARGUS were able to consume attributes other than X509, it could be connected to the site SSO
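
To make the idea of composing pushed and pulled attributes concrete, here is a minimal sketch of an attribute bag being assembled before an authz decision. It does not reproduce the ARGUS PIP interface: the function name, the dictionary representation of the bag and the source callables are assumptions for illustration.

```python
from typing import Callable, Dict, Iterable, List, Mapping, Set

AttributeBag = Dict[str, Set[str]]


def collect_attributes(
    subject: str,
    pushed: Mapping[str, Iterable[str]],
    pull_sources: List[Callable[[str], Mapping[str, Iterable[str]]]],
) -> AttributeBag:
    """Merge attributes sent with the request with those fetched per subject."""
    bag: AttributeBag = {}

    def merge(attrs: Mapping[str, Iterable[str]]) -> None:
        for name, values in attrs.items():
            bag.setdefault(name, set()).update(values)

    merge(pushed)                   # attributes pushed by the client
    for fetch in pull_sources:      # attributes pulled from external providers
        merge(fetch(subject))
    return bag
```

Values from different sources are simply unioned here; conveying which source (and which assurance level) each value came from would need a richer structure than a plain set.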

Is Shib an alternative solution?

  • Not covering the whole ARGUS service: only providing a bag of attributes
  • Unique ARGUS value: a central point, service agnostic, allowing to define authz policies based on attributes defined externally
    • Shib can be one of the attribute providers for ARGUS

OSG partially outsources authz to Globus Online through a transitive trust relationship

  • Pay G.O. for this service
  • OSG VOs other than LHC ones: users no longer directly exposed to VOs
  • OSG is ready to live with different authz services for non-X509 services and for legacy X509 services
  • One of the main differences between OSG and Europe: OSG focuses on internal challenges/issues, whereas Europe is more focused on building a European-wide infrastructure for ERA

Andrea: need to look at open source solutions but commercial ones also exist like Forgerock

  • Offering features very close to ARGUS, implementing similar ideas

Banning for site-specific reasons and central emergency suspension are different use cases, both valid, that can be implemented using the same technical mechanism

Summary: ARGUS provides a lot of what we need in the FedID world, but we need to sort out the technical details if/when we move there.

  • Might need to convey the assurance level for the attributes obtained
  • Also take care not to make things overly complex
  • Adding XACML3 support may help to interface with more services
  • Opportunity to make ARGUS useful in this space probably requires progress in the 1-2 year timeframe

Deployment, Performance and Scalability

Scalability: several sites run a large, scalable ARGUS service. But some report problems.

  • To make progress, need to make a list of the sites with problems and tackle them one by one

CERN: main issue is that when things begin to go wrong, troubleshooting is difficult and normal behaviour is often restored without a clear view of what fixed the problem

  • Manageable number of incidents so far
  • Some tricky issues were identified and sorted out/fixed by Valery in the recent months: now running some unofficial release
    • Some issues caused by insufficient resilience of OCSP responders when OCSP was switched on by some CAs: also involved CANL issues

Looking at sites reporting problems, there are several contradictory recipes for solving a given problem, e.g. which Java version works

  • Often linked to the usage of Java packages that are not very actively maintained: should WLCG take ownership of the maintenance of these packages?
  • Need to collect more precise data on the site load that triggers problems
  • MW Readiness effort/infrastructure could contribute to stress testing but need to find an inventive way to implement these stress tests

gridmapdir is another source of problems: can we recommend that sites rely only on Argus and forget about the gridmapdir?

  • Need to document how to provide a consistent mapping between several ARGUS servers: mount the gridmapdir on the ARGUS servers rather than on the WNs/CEs
    • Used at CERN successfully
  • Can we envision a consistent mapping between WAN-distributed ARGUS servers?

No very accurate numbers on the ARGUS deployment status in EGI, but it is probably pretty well deployed

  • More than 100 sites
  • CMS sites need to have it as glexec is a critical test
  • Some NGIs (like the French one) require it at each site for emergency suspension

ARGUS code is on GitHub, but the documentation is still on Twiki, with none of the remaining developers having access to it!

  • One possible action could be to move the documentation to GitHub pages: could be a collaborative, background effort

Misha: known issue in SL6 with libnss, which performs much worse than OpenSSL (SL5)

  • If providing an alternative library, don't forget the deployment issue (do not replace standard libraries)

Make clear that issues must go through GGUS

Interoperability and Convergence between OSG and Europe

Not the same use cases, converging seems difficult

  • Also, the limited manpower available on both sides makes a change on either side unlikely

ARGUS seems clearly an architecture with more potential, with the ability to play a role in the FedID world and its development is desirable

  • Using GUMS as a replacement (or at least for some of the components) would probably be a viable plan B, but would mean giving up on addressing this new use case
  • Migrating to GUMS would be a lot of work and would break backward compatibility with the authz infrastructure in Europe

Interoperability is already there, as demonstrated by Brian

  • A few issues still to sort out, like the ability to define a PAP ACL allowing an arbitrary GUMS site to retrieve information from ARGUS (banning information)
    • Idea to use OIM as the source of legitimate GUMS sites would not work, as GUMS is not registered in OIM

Available Effort for ARGUS Support and Development

INFN: agrees to support PAP and can contribute to PEPd and PDP if this is part of projects that may need the new features discussed (in particular an H2020 cloud-related project, answer expected in the coming month)

  • Could easily create an RPM for pap-admin only
  • A web interface to PAP would be useful

NIKHEF: C PEP lib and client, GSI callout

EGI ready to contribute to documentation

Scale testing as part of the WLCG MW Readiness activities

Main component left unsupported: Java PEP client, used by CREAM

  • In fact quite a lot of commonalities with PEPd

Possible other players: should find some way of recognizing their contribution

  • CESNET / Zdenek: expressed interest in testing
  • Krzysztof Benedyczak: Java CANL developer/maintainer, also linked to Java SSL, may be contacted to participate in work on the Java PEP client

Collaboration required with ARC security experts about ARC integration issues

  • Explore going through PEPd rather than talking directly to the PDP

J. White does not currently have a clear idea of whether he will be able to contribute but, if he can, he would like to continue the work started on EES (PEPd backend to do various things, including instantiating VMs)

Next Steps

Advertise that ARGUS will live

  • Already a few commitments for contribution to support and testing
  • Will look for others to make the community larger

Concentrate on high priority issues first

  • Not that many known...
  • Sites invited to report their problems in GGUS
  • Community will act on a best effort basis, based on good quality bug reports

EGI and WLCG will carry the message around

Take ownership of the ARGUS support list (Google groups): 2nd (3rd?) level support

  • Need new supporters in the coming months
  • In fact, Andrea is already a manager of this list
  • Vulnerabilities: use the SVG list as for other products

Issue tracker currently on JIRA: move to GitHub

Documentation: move to GitHub pages

Developer coordination

  • Create a separate mailing list
  • Document the contribution workflow (pull request...)
  • Define a release model

Summary of the workshop at January GDB by Maarten

Next meeting by Vidyo end of January/beginning of February

  • After results of H2020 projects from the Sept. call

Communication between attendees of this workshop (and others possibly interested in discussing the ARGUS community and its future): keep a separate list

  • Can be the existing workshop list or another Google group

Topic revision: r3 - 2014-12-12 - MichelJouvin
 