WLCG Operations Coordination Minutes - December 18th, 2014

Agenda

Attendance

  • local: Maria Dimou (chair), Nicolò Magini (secretary), Ulrich Schwickerath (Tier-0), Stefan Roiser (LHCb), Andrea Manzi (MW Officer), Luca Canali (IT-DB), Maarten Litmaath (ALICE), Alessandro Di Girolamo (ATLAS)
  • remote: Burt Holzman (FNAL), Cristoph Wissing (CMS), Di Qing (TRIUMF), Jeremy Coles (GridPP), Yuri Lazin (NRC-KI-T1), Michael Ernst (BNL), Rob Quick (OSG), Ulf Tigerstedt (NDGF)

Operations News

  • Reminder: instructions for sites to enable multicore accounting (details in the MB slides):
    • EMI-3 CREAMs have to enable multicore support (Edit /etc/apel/parser.cfg and set the attribute parallel=true.)
    • Sites using SSM1.2 should move to SSM2
    • Sites using DGAS should move to use the APEL client
  • Reminder: Site managers, please fill in the survey at http://cern.ch/wlcg-survey
  • The updated values for Tier0 Critical Services are ready and will be presented by Andrea S. today
  • The ARGUS Workshop held at CERN on Dec 11th (see its agenda) was a success. Participants agreed to meet again next February and to prepare a collaboration for ARGUS maintenance between now and then. The MW Readiness WG will participate in the ARGUS testing under load and/or with peculiar CA attributes.
  • There should be no worries for the coming months as far as WLCG support from sites belonging to NGIs leaving EGI is concerned.
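
The multicore accounting reminder above asks EMI-3 CREAM sites to set parallel=true in /etc/apel/parser.cfg. A minimal sketch of that edit follows, working on an in-memory sample rather than the real file. The section name "batch" and the sample contents are assumptions for illustration and should be checked against the actual file on the CE. Note that Python's configparser drops comments when rewriting, so a manual edit may be preferable on a production host.

```python
import configparser
import io


def enable_parallel(cfg_text, section="batch"):
    """Return cfg_text with parallel=true set in the given section.

    The section name is an assumption; adjust it to match the real file.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "parallel", "true")
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical sample standing in for /etc/apel/parser.cfg
SAMPLE = """[batch]
enabled = true
type = PBS
parallel = false
"""

updated = enable_parallel(SAMPLE)
print(updated)
```

To apply it for real, one would read the file, pass its text to enable_parallel, and write the result back after taking a backup.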

Middleware News

  • Baselines:
    • As already broadcast to sites, vulnerabilities have been discovered in FTS3 and gfal2. The latest available versions, FTS3 3.2.30 and gfal2 2.7.8, are not vulnerable, so they have been set as baselines and sites are advised to upgrade. Both versions will also be available in UMD by the middle of January (for FTS3 this is the first time in UMD).
    • Added dCache 2.11.4 (already deployed at KIT).
    • xrootd 4.1.1 is in EPEL5 and EPEL6 testing. This is needed for DPM sites to move to xrootd4; a new version of dpm-xrootd is under preparation.
  • MW Issues:
    • RHEL6 vulnerability (https://rhn.redhat.com/errata/RHSA-2014-1997.html): a new kernel is available and a WN update has been requested by EGI security.
    • Missing dcache-xrootd plugins in the WLCG repo:
      • CMS-TFC is available from dcache.org, but it would be nice to have it in the WLCG repo as well (KIT and PIC are pushing for it).
      • The latest dcache-xrootd-n2n (6.0.5) and xrootd-monitor (5.0.9), needed for dCache 2.10.x, are still not available in the WLCG repo.
  • T0 and T1 services
    • FNAL
      • Backported dCache 2.6 changes to dCache 2.2
      • EOS upgraded to 0.3.53
      • FTS upgraded to 3.2.30
    • KIT
      • Installed FTS3 3.2.30 (for non-LHC VOs; LHC VOs are enabled for emergency use)
      • dCache upgraded to 2.11.4

  • Jeremy Coles posts in Vidyo chat: An update for RAL T1: there is now a planned outage for the entire RAL Tier-1 batch farm declared from 09:00 to 13:00 on Tuesday 23rd December. This is to pick up the new kernel following EGI-ADV-20141217 - any running jobs will be lost.
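
A site could check its installed versions against the new FTS3 and gfal2 baselines listed above with a simple numeric comparison. The sketch below assumes purely numeric dotted version strings. The package keys and the installed versions are hypothetical examples, not data from any site.

```python
# Baseline versions taken from the minutes above
BASELINES = {"fts3": "3.2.30", "gfal2": "2.7.8"}


def parse_version(text):
    """Turn a dotted version such as '3.2.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def below_baseline(installed):
    """Return the sorted names of packages older than their baseline."""
    return sorted(
        name
        for name, version in installed.items()
        if name in BASELINES
        and parse_version(version) < parse_version(BASELINES[name])
    )


# Hypothetical installed versions, for illustration only
installed = {"fts3": "3.2.29", "gfal2": "2.7.8"}
outdated = below_baseline(installed)
print(outdated)
```

Comparing tuples of integers rather than raw strings avoids mistakes such as treating "3.2.9" as newer than "3.2.30".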

Oracle Deployment

  • Luca Canali reports that the interventions for 2014 are concluded. Since no major interventions are planned during 2015, it was agreed to remove the "Oracle Deployment" section from the agenda in 2015.

Tier 0 News

  • Ulrich Schwickerath reports two items for Tier-0 on behalf of Maite Barroso and Manuel Guijarro:

  • Alberto Peon asked each experiment to provide a list of "expert users" to participate in voms-admin testing; so far only LHCb has answered.
  • Maria Dimou reminds participants of the existing ticket GGUS:110227 with the feature requests from the experiments

  • The proposed date for the decommissioning of the AFS UI is Monday, February 2nd 2015.
  • Stefan Roiser asks if the list of users of the AFS UI can be generated weekly to check the progress; also if a separate list can be generated for the UI and the certificates directory. Ulrich will forward the requests to the service managers.

Tier 1 Feedback

Tier 2 Feedback

Experiments Reports

ALICE

  • high activity during the last 2 weeks
  • expectations for the end-of-year break:
    • mostly steady MC production
    • some raw data reprocessing
      • data being staged, no tape activity foreseen during the break
    • low analysis activity
  • ARC CE SAM tests:
    • direct job submission probe seems to have/provoke a memory leak; not yet debugged

ATLAS

CMS

  • Production/Processing overview
    • Tier-1: DIGI-RECO campaigns Phys14DR, Fall14dr, Summer12DR53x
    • Tier-2: Various MC productions: Mainly Run2
  • Productions expected to continue over the Xmas/New Year break
    • Quite some work launched recently
  • Had some instabilities with our web services last week and the week before
  • 50% of Tier-1 capacity multi-core enabled in January
    • If site has dedicated multi-core resources, it should provide this fraction
    • Will start from functional tests and scale to the planned level
    • Will be partly used in "partitionable slot mode" (running n single-core jobs in an n-core multi-core pilot)
    • Long lifetime of pilots preferred -- what is still feasible for the sites?
  • In the process of moving CRAB and central production into a single global Condor pool
  • Opened quite some tickets for sites that had not yet updated their local site configuration with the "PhEDEx Node Name"

  • Cristoph Wissing comments that the tickets for "PhEDEx Node Name" are not urgent so there's no need to escalate them during the holidays.

LHCb

  • "Run 1 Legacy Stripping" campaign is progressing well
    • Decision to be taken tomorrow but probably will run over the Xmas break
    • In best case will finish in ~ 4 weeks
  • HTTP/WEBDAV site accesses
    • 5 sites missing, out of those 3 have provided access points that need to be verified.

WLCG critical services

Ongoing Task Forces and Working Groups

gLExec Deployment TF

  • gLExec in PanDA:
    • testing campaign: 11 sites added (43 total)
    • issues at a few sites being investigated

Machine/Job Features

Middleware Readiness WG

  • The MW Readiness WG status presentation by the MW officer at the December 2014 GDB gives the status of the effort up to last week.
  • A slow-down in participation by sites is observed.
  • The MW Readiness WG will participate in the ARGUS testing under load and/or with peculiar CA attributes.

  • Maarten Litmaath reminds that the stress tests of ARGUS were performed many years ago; the MW Readiness WG wants to repeat the testing under current conditions, also in light of the not fully understood issues seen during 2014 at large sites like CERN. Unlike for the other services, it will be a large-scale testing infrastructure.

Multicore Deployment

  • NTR

IPv6 Validation and Deployment TF

  • NR

Squid Monitoring and HTTP Proxy Discovery TFs

  • Nothing new to report

Network and Transfer Metrics WG

  • NR

Action list

  • ONGOING on the experiments: check the usage statistics of the AFS UI, report on the use cases at the next meeting.
    • Tier-0 to provide full usage statistics weekly if possible; separately for the AFS UI and the AFS certificates dir.
  • ONGOING on the WLCG monitoring team: evaluate whether SAM works fine with HTCondor CE. Status: HTCondor CE tests enabled in production on SAM CMS; sites publishing sam_uri in OIM will be tested via HTCondor (all others via GRAM). The number of CMS sites publishing HTCondor-CE is increasing.
    • Ongoing discussions on publication in AGIS for ATLAS.
  • ONGOING on experiment representatives - report on voms-admin test feedback
    • Experiment feedback and feature requests collected in GGUS:110227
  • ONGOING on Andrea Sciabà - review the critical services table

AOB

  • Next VIRTUAL meeting on January 8th, 2015.
  • Next regular meeting on January 22nd, 2015.

-- NicoloMagini - 2014-12-18

Topic revision: r18 - 2018-02-28 - MaartenLitmaath
 