Summary of January GDB, January 16th, 2019 (CERN)
*DRAFT*
Agenda
Introduction - Ian Collier
slides
The GDB started with a quick overview of the agenda and the upcoming meetings and events. Please keep in mind that the April and May GDBs will be co-located with the ISGC and EGI conferences respectively.
Concerning the WLCG-HSF-OSG Workshop, Ian pointed out that the hotel offers arranged by the organisers are quite advantageous. When planning your trip, the ride to and from the airport has to be considered too; a Google spreadsheet is available to assist with ride sharing.
Ian encouraged participation in the upcoming Benchmarking WG meeting on Friday the 18th. Expertise on experiment workloads would be especially helpful at this meeting.
It is planned to co-locate a WLCG workshop with CHEP 2019 in Adelaide.
CERN-SKA Collaboration - Ian Bird
slides
Ian Bird gave an overview presentation of the CERN-SKA collaboration.
After more than a decade of advising, reviewing and ad-hoc consultations, a formal collaboration agreement between CERN and SKAO was signed in July 2017.
Many similarities in the scale of data processing and organisational structures have been identified. Synergies are expected to arise from complementary expertise. Regular meetings and joint roadmap papers for scientific data management are some of the activities stimulated by the collaboration agreement, as are shared projects within the Openlab framework.
The ESCAPE project is the quintessence of these developments, covering aspects relevant to both communities in areas such as:
- Regional Centres
- HPC - PRACE
- Networking
- Software methodology and performance
- OpenStack cloud enhancements
A remotely connected participant remarked that there are many common activities in the UK between WLCG and SKA.
Ian replied that the collaboration agreement is between CERN and the SKA project office.
A person in the audience questioned the selection of topics in ESCAPE, especially the approach of working on concrete, existing tools such as OpenStack, without in-depth discussion of the requirements.
In his response Ian pointed out that ESCAPE will cover this: in particular, by building a prototype based on data from existing telescopes, the project will come to understand many aspects of the proposed regional centres.
Someone in the audience added that the common lobbying for a joint infrastructure is also an important aspect.
Ian supported this and added that we have done this with PRACE and the exascale projects, which has not always been easy.
IPv6 Report - Andrea Sciaba
slides
Andrea Sciaba gave an update on the status of the IPv6 deployment in WLCG.
The deployment deadlines set by the timelines for T1s and T2s have passed and the goals have not been met.
Nevertheless, massive progress has been made: IPv6 traffic has increased by 47% since September, and 50% of the traffic from CERN to the T1s is over IPv6.
All T1s on LHCOPN are offering IPv6, with the exception of one site, and most other T1 services are OK. Communication with the T2s is via GGUS tickets, and via direct contact for US-based T2s. 53% of the T2 storage is already IPv6 ready, which is good progress. Most problems are not due to the sites themselves, but are related to required network interventions outside the control of the sites' teams.
Since the progress has been steady, Andrea predicts that we will reach 60% within days, 75% within weeks and 90% within a few months.
He also pointed out that for CMS 75% of their storage is already on IPv6.
There was a discussion on a deadline extension. The conclusion was that there is no reason for a formal new deadline, as long as we keep the pressure high.
NDGF suggested that one year after the deadline, IPv4-only sites could be declared 100% non-available.
Andrea replied that the experiments could do this and that some of them consider this, but this decision has to be taken by the experiments, not by WLCG.
Maarten clarified that in principle ALICE can make use of IPv6-only operations when all sites are ready, but it is not an issue as long as deployment is close to 100%, especially when only smaller sites are late.
Dave Kelsey thanked Andrea for his great work and expressed a certain level of disappointment with some T1s and with the fact that there is still IPv4 usage between IPv6 sites.
ATLAS pointed out that the task force should make sure that sites that are not used anymore should be removed from the list.
Andrea confirmed that to his knowledge they have been removed.
It was noted that some sites that worked in the past do not always continue to work, which results in a sizeable workload for the shifters.
Andrea explained that the activity is about the first deployment: tickets are closed when the IPv6 services work for the first time. From then on, failing IPv6 services are standard operations issues, covered by the standard procedures.
Ian C. stressed that the ticket system should be used vigorously, since it is very important to collect reliable information on existing issues.
Ian C. thanked Andrea for his outstanding work and stressed that he has done this mostly alone.
WLCG Storage Space Accounting - Dimitrios Christidis
slides
There were some questions and discussion about the dashboard being slow to update:
- It should be better after first access; things have to propagate through the system.
- Raise a JIRA ticket if problems persist.
Monitoring developments - Borja Garrido Bear
slides
Report on developments in the monitoring framework for CERN data centres and WLCG.
DOMA QoS report - Paul Millar
slides
Survey coming soon
Call for more participation
Maarten: for ALICE it is very important that XRootD be enhanced with these concepts - are the XRootD people on board?
PM: does not yet have a contact person - not essential yet, since the work is not yet at the technology level.
ML: there will be at least a few storage developers on board already thinking about the implications - don't wait too long to invite them.
PM: note the XDC logo - the XDC H2020 project has broad participation from the relevant interested parties and should pull in all the people involved.
Concezio Bozzi (the new LHCb computing coordinator): of course we are very much interested. Our data management team is quite thin and busy with upgrades; I am looking for people to participate.
PM: It is useful for people to attend when they can - it does not need to be every meeting to ensure issues and use cases are highlighted.
CB: It is not clear to me what 'buffer storage' is. Only the copy before replication? Both output, custodial, fast…? How does this fit in the straw-man model?
PM: good point. This could be incorporated as a new QoS expectation. One of the main motivations is to foster experimentation and trading possibilities. If the buffer holds data coming from online, you may be able to use QoS labels to define requirements.
ML: T0 will always be special.
Oliver Keeble: we understand commitment to come to every meeting may be too much - any attendance welcome.
IC: Good start at defining a richer language to describe storage types. Would be surprising if it had captured everything right away.
OSG Cache on Internet Backbone developments - Edgar Fajardo Hernandez
slides
XRootD caches on internet backbones.
Maarten Litmaath: This is mostly focused on ‘other communities’
EFH: ATLAS has its own similar project, led by Rob Gardner at UoC.
ML: In other regions we should take a good look and see if we can help these regions become the 'Netflix of science'.
EFH: at KISTI we also have some PRP & LIGO presence
ML: for ATLAS & CMS there are indeed similar projects, and the big players may not have quite the same infrastructure. Other communities would only be on the rise.
EFH: Good for ‘in the middle’ groups - 10k jobs & ~20 TB
Dave Britton: We need to reconcile the differences between the CDN and data lake approaches - would it make sense for ATLAS to keep all data 'in flight on the network'?
ML: these experiments have been successful in OSG for some years.
Johannes Elmsheuser: For ATLAS, this caching comes in more when data is reused, e.g. for analysis.
--
IanCollier - 2019-01-28