WLCG GDB Action List

This page lists actions currently monitored by GDB and topics that will require actions in the future. Note that most actions related to operations are followed by the Operations Coordination.

General Actions

Machine/job features

  • Increase the adoption: identify sites ready to install the framework developed, see December GDB
    • Contact Stefan Roiser

Security

Storage

Network

  • LHCONE: discuss a process for handling requests from new experiments that want to join LHCOPN

Belle2

  • Help establish concrete links for collaboration (memory limitation, SAM, GGUS...) after June GDB

Other Items to Follow-up

Security:

  • Follow-up TEG recommendations for WN AAI and proposed focused WGs
    • Improved traceability: configuration recommendations for sites to meet the WLCG traceability requirements, take into account specific needs for virtualized WNs (including those running in external clouds)
    • Possibility to reduce proxy lifetime to 24h: preliminary analysis with experiments, milestone proposal (compatible with LS1?), implementation
    • Challenges and issues with respect to security in the usage of external/public clouds
  • Give feedback on documents about Security for Collaborating Infrastructures. See July GDB
  • Federated Identity Pilot: use for access to grid services (WLCG Pilot)

Miscellaneous

Actions on Hold

June 2015

  • GOCDB extension properties: VOs should report to WLCG IS TF envisioned use cases to define a consistent usage strategy
  • Establish a list of Class 2 services (VOBOX terminology) run by every VO (VOBOX and other services)
    • Initial list to be started by J. Templon with ATLAS information (June 2015)

January 2015

  • Raise awareness on bitcoin mining activities: lower activity in the last month
    • Discuss with EGI (and OSG?) common communication actions and documentation for sites
  • Multicore job support must be enabled at all sites
    • CMS: 50% of T1 capacity by end of January 2015

October 2014

  • Pool account recycling: extend the default grace period to 6 months
    • Maarten is in charge of passing on/discussing the requirement with EMI (June 2012)

June 2013

  • Multi-core job support in the BDII: focus on multi-core jobs for late binding (pilot jobs) through the machine/job features mechanism
    • Start tests with volunteering sites (limited number) and hardcoded number of cores (suggestion 4)
    • Publication of multi-core capabilities in IS: final decision on open issues in last proposal made during the summer
      • Use a pragmatic approach: do not look for the perfect specification but for one that handles the main use cases with a minimal amount of development (both for MW and experiment frameworks)

Actions Completed

September 2015

  • HS14: follow-up on progress (Spring 2015)
    • Both for a procurement/pledge benchmark and a fast benchmark (see October GDB)
    • HS06 must remain a 32-bit benchmark (see April GDB)
    • Also follow-up progress on the possibility of a "quick benchmark": see Helix Nebula experience
  • Define next steps: expected for September GDB
  • The last 10 sites not properly publishing their number of cores should fix this or report problems (July 2015)
  • WLCG IS: try ginfo (lcg-infosites replacement) and send feedback to project-grid-info-support@cern.ch
    • June 2015: ALICE may try it to access GLUE2Share on ARC CE
  • GSR on hold: AGIS evaluated as a potential alternative by CMS (2015?)
  • Discussion with OSG about using GLUE2 in the information provider of the new HTCondor CE
    • Also followed up by the OpsCoord IS TF
  • RFC proxy migration: in progress, managed by EGI

July 2015

  • Each region must review the sites not publishing core count correctly: see John's email/report to GDB list on May 19, 2015
    • Particularly urgent: Italy, Spain, China, Asia-Pacific

June 2015

  • Send call for volunteers to host the next WLCG workshop (early 2016)
  • Check ARGUS and CREAM plans/timeline regarding EL7 support
  • HTCondor: follow-up issues with publication for CREAM (obsolete gip plugin) and ARC (missing VOView)
  • IPv6: perfSONAR instance dual-stacked at each T1 by April 1st (not a joke!)
  • Create a table of storage implementations/protocols supported and experiments/used-usable protocols (see January 2015)

May 2015

March 2015

  • Discuss possibility of October GDB in the US, co-located with HEPiX at BNL
    • GDB week happens to be HEPiX week
  • perfSONAR: reinstall instances ASAP (November) with v3.4 after the Bash ShellShock vulnerability
    • Deadline = February 16, 2015, but still many sites without a working perfSONAR instance

February 2015

  • Accounting: follow-up APEL 'parallel' option enabled by default in future client release
    • Decided not to do it as this will have no impact on installed sites...
  • VOMRS -> VOMS-Admin: planned for February 3, 2015

January 2015

  • HS14 progress reviewed
  • Distribute WLCG site survey about operational costs. Deadline = December 19
    • Country coordinators to check with Andrea and Maria the sites who answered
  • ARGUS: outcome of December 11 meeting, clarification of future support
  • Review cloud adoption document and send feedback to Laurence.Field AT cern.ch
  • FTS Web portal: advertise the beta service available at CERN and encourage feedback to fts-support@cern.ch
    • Not restricted to LHC VOs: sites should feel free to encourage other VOs to test it

December 2014

  • WLCG monitoring: follow-up for discussion on submission timeout in February after moving to Condor submission
  • Endorsement of GEANT Data Protection Code of Conduct by WLCG? (See MB Action List, 140916-8)
  • SAM test failure after DPM 1.8.9 upgrade: need new SRM probe, expected by end of November

November 2014

  • ARGUS future and support: organize a meeting with developers
  • Data Preservation: feedback from experiments about their interest in participating in an H2020 project
    • See October GDB, decision to be taken mid-November at the latest
  • Check new availability/reliability reports (default reports in November)
  • Feasibility to create a WG on traceability gap analysis in the cloud context

October 2014

  • SL/CentOS: follow-up initial discussion at each GDB until final decision at next HEPiX
  • CVMFS grid.cern.ch repository (MW clients): get feedback from LHCb

September 2014

  • Migration to GFAL2-based DM clients: experiments must update developers with their concrete plans
    • Decommissioning date: October 2014
  • MW baseline version enforcement: follow-up on the discussion at June GDB about Package Reporter vs. Pakiti
  • Storage protocols: pre-GDB scheduled in December
  • VOMS servers with SHA-2 certificates: wait for final fixes (June 14)
  • Federated Identity Pilot: implement it for web applications (first phase of pilot completed)

June 2014

  • Evaluation of MW clients in a CVMFS repository (repository name = grid.cern.ch): GFAL2 clients and GLUE2 clients as a driver?
    • Repository reorganization to match AFS structure: check status with J. Blomer
  • Machine/Job Features: looking for sites deploying and producing the machine/job features data
    • See April GDB
    • NDGF will find a site helping with SLURM integration (April GDB)
  • HS14: follow-up on progress at end of Spring (June)
  • Accounting assessment TF (J. Gordon): batch system experts are welcome to volunteer

May 2014

  • OpenSSL vulnerability: sites must urgently take appropriate action, as explained in the alert
    • Not later than April 15
    • Potentially all SL6 systems, including < SL6.5 if errata were installed
  • Batch systems: sites should review summary table from March pre-GDB
  • SL/CentOS: follow-up initial discussion at each GDB until final decision at next HEPiX (May)
  • WLCG monitoring
    • Review progress with Condor-G submission (May?)
  • MW readiness testing
    • List of volunteering sites for the April MB
  • WLCG Monitoring: Present/discuss monitoring integration at site between WLCG monitoring and EGI monitoring

April 2014

  • Batch systems: build summary table from March pre-GDB (Michel/Alessandra)
  • CVMFS: sites must upgrade by March 1st (2014) to 2.1.15
  • perfSONAR: sites must upgrade by April 1st (2014) to 3.3.1
    • 9 problematic sites left
  • glexec and ARGUS deployment
    • 16 problematic sites left
  • Machine/job features on WN: package ready for deployment
  • WMS decommissioning at CERN in progress according to planned schedule

March 2014

  • HS14: follow-up for initial discussions in January

February 2014

  • Handling of jobs with high memory requirements: usability of PSS for implementing memory limitation, sharing of recipes
    • Jeff to present NIKHEF approach based on Torque at a future GDB (February 2014)
  • Site monitoring/notifications
    • Encourage Site Nagios testing by more sites (PIC and NIKHEF ready to share their experience)
    • See presentation and documentation pointers in the June 2013 summary: https://twiki.cern.ch/twiki/bin/view/LCG/GDBMeetingNotes20130612#Site_Notifications_with_SAM_and
    • Review after progress with Monitoring Consolidation
  • SAM test scheduling: follow-up initial discussion about prioritizing SAM tests (February 2014)
  • Feedback and hosting proposals for a WLCG workshop beginning of July

January 2014

  • Security: review new security challenges brought by cloud infrastructures
  • Promote Davix library usage for Dav access
  • Machine/job features on WN: package ready for deployment
  • VOMS-Admin stability issue fixed: restarting tests to replace VOMRS
  • Accounting: check with EGI deployment plan for new APEL publishers, in particular StAR publishers.
    • Planned for January 2014 GDB

December 2013

  • Migration to GFAL2-based DM clients (see October 2013 GDB): review/finalize experiments' plans at December 2013 GDB
    • EGI input on non WLCG VOs welcome
  • WN SL6 migration
  • glexec: significant deployment achieved
  • SHA-2: experiment SW validated, services validated
  • BDII: top/site BDIIs must be upgraded to 5.2.21 (October 2013)

October 2013

  • Jobs with high-memory profile: NIKHEF working on a solution implementable at each (Torque?) site
  • Machine/job features on WN: work restarted after June GDB, in the context of the cloud work (graceful termination of VM)
    • Batch systems: concentrate on the "opportunistic" multi-core support use case
  • Review progress of storage-related WGs (Autumn 2013)

September 2013

  • Decide setup of an automated infrastructure for MW validation/testing

July 2013

  • Security
    • WLCG endorsement of new policies (Service Operations Security, new AUP): probably for MB
    • SHA-2: present a more detailed timeline (July)
      • Sites must schedule the upgrade of EMI-2 services to the appropriate update
      • Clarify dCache plans
    • More detailed presentation of the SCI policy, EGI's potential future role in it, and sustainability plans for this effort
  • MW provisioning/lifecycle
    • First review of EGI-driven MW provisioning process (September 2013)

June 2013

  • Machine/job features on WN: work resumed

May 2013

  • Site configuration management tools
    • Encourage sites to answer EGI survey
    • Discuss with HEPiX and EGI the setup of a Puppet users group: group started by HEPiX in Bologna
  • CVMFS: identify sites ready to evaluate new CVMFS version (2.1.5)
    • Main new features: NFS export, shared caches
  • SHA-2
    • Setup of a test SHA-2 CA at CERN, accredited by IGTF, for SHA-2 early testing

March 2013

  • Computing resources through clouds: build a work plan for future work
  • GLUE2
    • Follow up progress on deployment of new tools, get experiment feedback on them: validator, ginfo
  • Review progress on StAR implementation
  • Review progress of storage-related WGs
  • Federated Pilot progress report
  • glexec: new milestones defined for wider deployment

February 2013

  • EMI migration completion (DPM, LFC, WN)

January 2013

  • Clouds/virtualization: discussion with experiments to identify the main use cases and define a work plan for future work after completion of the HEPiX WG
  • Report on discussion with experiments about IT/ES future after the end of EGI-Inspire
  • MW clients as a CVMFS repository, as presented at November GDB

December 2012

  • MW provisioning and validation process
    • Several comments received by Markus
  • EMI-2 migration
    • Most sites migrated their obsolete services (everything except DPM, LFC and WN)
    • Good collaboration between EGI and WLCG (Operations Coordination)
  • GLUE2
    • Rediscuss with providers, experiments and IS experts the need for GLUE2 support by all infrastructures, including OSG (see November GDB discussion)
  • Network
    • perfSONAR: Monitor progress of mesh configuration support in perfSONAR-PS
  • DPM community: monitor agreed contributions after DPM workshop early December

November 2012

  • EMI-2 migration
    • Sites must either do the migration before end of October or communicate to EGI a reasonable migration plan/roadmap, indicating technical blockers, if any
    • Follow-up on the migration status at November GDB
  • BDII: discussion with most used top-level BDII about quality of service and service configuration
  • IPv6: update on testbed activities next autumn
    • Planned at November GDB

October 2012

  • OPS team initialization and kickoff, including focused task forces
  • WN environment information: proposal finalized and implementations ready for LSF, SGE, PBSPro and Torque
  • EMI-2 testing/validation
    • WN validated: known problem to be fixed in next EMI release (due Oct. 22)
    • No known issue with other services
    • Aggressive roadmap defined for migration: end of November for obsolete services, end of December for other pre-EMI services
  • GLUE2: clarify OSG plans regarding GLUE2 support as it is a critical piece in IS evolution planned by EMI/EGI
    • OSG has no plan as it identified no use case for BDII use in OSG future
  • Clarify expected/required storage/data interfaces: plan a pre-GDB to agree on requirements (October 2012)
    • Includes but is not restricted to SRM future discussion

September 2012

  • Clarify with EMI timescale for StAR implementations and possibility to test it with current SE product versions
  • SHA-2/RFC proxies: get feedback from IGTF about delaying SHA-2 use until Jan. 2014
    • IGTF meeting on Sept. 10 agreed to postpone SHA-2 upgrade until August 2013, with a monthly review of deployment progress and blockers

July 2012

  • Review TEG recommendations and decide which WGs need to be set up by WLCG
  • glexec and ARGUS deployment
    • Plan a more detailed presentation on how central banning works and how it can be used by non-ARGUS sites
  • SHA-2/RFC proxies: validate proposed milestones and review need for a plan B at July's GDB
  • Middleware support/development/provisioning after EMI
    • Attempt to organize a first meeting with EMI/EGI/OSG before mid-July (June 2012)
  • Operations Coordination Team set-up: discussion planned at July's GDB
  • DPM future: DPM community idea presented
  • pre-GDB to clarify WM TEG recommendations and establish concrete plans

Topic revision: r51 - 2015-10-10 - MichelJouvin