CERN active from start with commercial cloud services
Microsoft Azure work published
Deployment with T-Systems ongoing, Sept-Oct
Hybrid model
Procurement process not matched to commercial services
PICSE came up with recommendations
Pre-Commercial Procurement: 1- research organisations
ESRF joined consortium
>1.6M€ procurement funds
Some manpower commitment
Focused at infrastructure level
Get fundamentals right
Working with systems managers, IT staff rather than solely users
WLCG, other physics, all sponsored.
Needs <-> Provision use cases; risk assessed
Results documented
Payment models; which model is best for each use case
Don't replace orchestration of experiments
Running through to end of 2018
Most Economically Advantageous Tender (not necessarily cheapest, best quality)
Q: Are you sure that someone will be able to live up to the requirements?
That is the reason for the open market consultations. Companies rated the difficulty themselves. A number have already put in SAML-based identity systems. They have to see there's a market.
Length of each one, 3 months design, more for prototyping...
Ian N: Down the road?
What's good for community - looking to sign for annual period, a year-ish
Ian N: Not a way of providing on-demand?
Thinking of framework agreement, actual sum you would pay would depend on consumption
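A minimal sketch of the consumption-based idea in the answer above: under a framework agreement the unit prices are fixed up front and the sum paid follows what is actually used. The function name, resource names, and all numbers are hypothetical, chosen only for illustration; nothing here reflects actual tender terms.

```python
def consumption_cost(usage, unit_prices):
    """Return the amount due for one billing period.

    usage:       mapping of resource name -> units consumed (hypothetical names)
    unit_prices: mapping of resource name -> agreed price per unit, in euros
    """
    total = 0.0
    for resource, units in usage.items():
        if resource not in unit_prices:
            # Anything consumed must have a price agreed in the framework.
            raise KeyError(f"no agreed unit price for {resource!r}")
        total += units * unit_prices[resource]
    return total


# Illustrative figures only (assumptions, not from the procurement).
agreed_prices = {"core_hours": 0.5, "tb_months": 20.0}
quiet_month = {"core_hours": 200, "tb_months": 10}
busy_month = {"core_hours": 1000, "tb_months": 10}

print(consumption_cost(quiet_month, agreed_prices))
print(consumption_cost(busy_month, agreed_prices))
```

The same agreed prices give a different bill each period, which is the difference from paying a fixed sum for reserved capacity.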
Q: If I compare the two current procurements, there is a difference in the way experiments interact. The trend is more towards talking to a central point, then fanning out.
Not always a technical reason for that. What if a company fails? Who handles the mess? An intermediate layer shields the experiments. Contract management etc...
Ian B: Really through batch system, through condor, so immune to what's underneath.
Q: Operationally, day to day, I want to know that part of the cloud infrastructure, e.g. 50 nodes at place A, is not working, and want to disable it.
Ian B: Yes - we'd want more or less transparent extension of T0.
Maarten: Fed Ident? Companies would have to match each other because that's what this means. How much does this buy us, since we'll need our own ident, VOMS etc.?
Not going to create accounts for each user for each company. Doesn't mean will replace infrastructure or take away work of integration
Ian B: Don't see this being used in WLCG only. For WLCG, an overlay as it currently looks. Also for individual users in other areas who come with an institutional ID.
BDII and Information Systems (Maria Alandes Pradillo)
Representation of IS today, tools interacting with it
LHC dependencies, help us to understand complexities; what is defined doesn't match what is consumed
All depend on BDII, Top-BDII, REBUS
Only ALICE use dynamic attributes
BDII pros and cons
Cons: info quality; information is not validated; effort needed for post-validation; in the end VOs don't trust it
Proliferation of home made IS
Discussing: CRIC system [Computing Resource Information Catalogue]
Central CRIC + expt CRICs
API to applications like monitoring to query CRIC etc.
Experiment specific CRICs (ALICE/LHCb), very basic topology information like site names
Consider whether we stop relying on BDII
Stopping dependencies on BDII discussed
Need to document plans to stop dependencies and where to get information
Until CRIC is there, not going to change anything
Known issues (capacities in REBUS)
Waiting for the new system to be in place, making sure info is reliable...
Extra slides on info sources, etc...
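The central CRIC + experiment CRICs arrangement noted above could be sketched roughly as follows: one central catalogue holds static topology (site names, queues), and each experiment keeps its own view layered on top. This is only an illustration under assumptions; the class names, fields, and the site and queue names are all hypothetical and do not represent the actual CRIC data model or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Static topology entry (hypothetical fields)."""
    name: str
    queues: tuple = ()


class CentralCatalogue:
    """Stand-in for a central catalogue of static site information."""

    def __init__(self):
        self._sites = {}

    def add_site(self, site):
        self._sites[site.name] = site

    def get_site(self, name):
        return self._sites[name]

    def site_names(self):
        return sorted(self._sites)


class ExperimentCatalogue:
    """Stand-in for an experiment view layered on the central catalogue."""

    def __init__(self, experiment, central):
        self.experiment = experiment
        self._central = central
        self._queue_for_site = {}

    def set_queue(self, site_name, queue):
        # The queue choice is agreed between site and VO, then recorded here;
        # it must be one the central catalogue knows for that site.
        site = self._central.get_site(site_name)
        if queue not in site.queues:
            raise ValueError(f"{queue!r} is not a known queue at {site_name}")
        self._queue_for_site[site_name] = queue

    def queue_for(self, site_name):
        # Returns None when no queue has been agreed for this site.
        return self._queue_for_site.get(site_name)


central = CentralCatalogue()
central.add_site(Site("SITE-A", queues=("short", "long")))
central.add_site(Site("SITE-B", queues=("default",)))

expt = ExperimentCatalogue("EXPT-X", central)
expt.set_queue("SITE-A", "long")

print(central.site_names())
print(expt.queue_for("SITE-A"))
```

Here the experiment layer only references entries from the central catalogue, so topology is entered once and each experiment adds its own choices on top.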
Michel: Compared to previous plan, is it the same idea?
CRIC uses some concepts from AGIS but is a new thing, incorporating CMS needs/systems. AGIS devs + new devs. AGIS code base as starting point.
Maarten: Advent of CRIC would be good thing, first time nice consistent overview of what WLCG is. Always been missing. Good thing, proven AGIS technology + CMS requirements, plus discussions in TF. Generic, flexible tool. Hopefully not controversial. Making ourselves independent, we don't have to worry if the BDII data is of bad quality, nice evolution of technology.
John Gordon: Based on Open Source?
Python, Django
John Gordon: Other people use it?
The idea is to make something quite generic able to be adopted by other communities
Maarten: beware of potential scalability concerns; with our use case in mind, not going to be queried 100 times a second. Specific to ATLAS, made more generic, now in principle capable of seeing variety, be careful what you'd want to use it for
John G: Make it safe against 100/second?
Maarten: Can always DoS, we have other ways of dealing with that
Gavin McCance: Is it designed to scale out?
Jeff: Three things in BDII don't want to replicate by hand
1 CE drain state? yes/no
Maarten: only ALICE use it
2 CE machine, list of endpoints, not in GOCDB.
Need to understand if this goes to GOCDB or to CRIC
3 ACLs for queues.
Jeff: Connector BDII -> GOCDB
Maria: Need to understand use cases.
Maarten: EGI use cases, not about dropping BDII, make us independent of it.
Jeff: Fine if independent of BDII as long as this doesn't increase load on site personnel
Simpler systems are intention, less work for sites.
Maarten: Important input (work for sites). What is in BDII is largely ignored. Idea was system fully dynamic. If today a queue has a different name, the original grid paradigm was: auto discovery -> auto use. However, a site may have multiple queues, not every queue suitable for every flow, so anyway have to discover which queue to use, by talking to the site admin. Existing practice, try to cast into sustainable official position now.
Maria: How to proceed?
Q: Where is the effort coming from to sustain this? Is this going to require a new component at sites?
Intended to be central system, Central CRICs etc.
Don't see sites have to install anything, maybe too early to say but that's the plan.
Ian C: all dynamic info, not GOCDB, rely on telephone and email?
Maria: Which info? Essentially all info under consideration is static, not dynamic
Ian C: If the only info sources are static?
only ALICE want dynamic, happy to continue using the BDII for that
Ian C: Queue names, ACLs, people ignore it so we rely on current situation
Michel: Tend to agree with Ian. Use of HTCondor bypasses dynamic info: will it also become a requirement for ALICE?
Maarten: Don't spend too much time on ALICE
Michel: Have backup plans for dynamic? are these putting requirements on sites?
Maarten: No backup plans for dynamic information, only talking about static information. A new queue name or new VO ACL usually requires discussion between site + VO, as has been the case for 10 years. Obviously could try to implement a fully dynamic system. Have tried to make it reliable, but it is a fundamental battle. Even dynamic info is highly questionable much of the time for many sites. In TF: shall we continue on this path trying to make the system reliable, or consider alternatives?
Michel: One info not mentioned, DIRAC for submitting jobs, how many jobs in queue?
Not through BDII, query the CE directly
Maarten: Main point was introduction of CRIC, can only bring benefits, finally have right technologies, frame of mind, positive thing. Can we depend less on other things. Don't forget this, discussion is difficult, that's why we have a TF, have been having these discussions for many months. This is what we came up with that has most traction, maybe not perfect but in complex system
Maria: Ask for input on whether this is appropriate: info sources, getting data into CRIC, putting it in practice. Need to know if WLCG is fine with this; can keep discussing but need to move forward. This is a proposal. Need green light.
Julia Andreeva: ATLAS: rely mostly on AGIS, biggest info, if they succeed?
Michel: In ATLAS data AGIS collects include dynamic data. Can also in CRIC add things.
Maria: Slide 4, ATLAS doesn't use dynamic info in BDII
Peter Solagna: If WLCG-specific info is to migrate from BDII -> something else, would encourage having it in a general-purpose tool. CE endpoints and queues are in BDII and are needed to submit jobs; if effort is spent moving these away from BDII, they should move to GOCDB.
Maria: That's the idea. Did an exercise with one UK site to put a minimum set of info in GOCDB; quite easy to add this today.
Ian Bird: For approval, needs to be put before the Management Board (next week). Need to continue with TF on edge cases and on dynamic vs static: a queue name is not dynamic even if changed every day. A lot of info is never used; need something similar. One slide requesting approval, with design of service etc.