Week of 130408


Daily WLCG Operations Call details

To join the call, at 15.00 CE(S)T Monday to Friday inclusive (in CERN 513 R-068) do one of the following:

Dial +41227676000 (Main) and enter access code 0119168, or
To have the system call you, click here

The SCOD rota for the next few weeks is at ScodRota

WLCG Availability, Service Incidents, Broadcasts, Operations Web

VO Summaries of Site Usability: ALICE, ATLAS, CMS, LHCb
SIRs: WLCG Service Incident Reports
Broadcasts: Broadcast archive
Operations Web: Operations Web

General Information

General Information: CERN IT status board, WLCG Baseline Versions, WLCG Blogs
GGUS Information: GgusInformation
LHC Machine Information: Sharepoint site - LHC Page 1

Monday

Attendance:

local: Alessandro, Belinda, Jarka, Maarten, Maria D, Simone
remote: David, Joel, Kyle, Lisa, Michael, Onno, Pepe, Rolf, Salvatore, Thomas, Tiju, Wei-Jen, Xavier

Experiments round table:

ATLAS reports (raw view) -
- T0
  - GGUS:92166 (transfers to CERN failing with "Error with credential") is still open and still causing trouble. The ticket was opened at the beginning of March; the issue has been intermittent and, as far as we know, was never really understood. CMS observed the same issue at some point. ATLAS updated the ticket today with the most recent failures. Please investigate.
    - Maarten: the expert was overloaded with other urgent matters and then away on holidays; will follow up offline
    - Simone: the matter has not been critical because the transfers usually make it OK later, but the monitoring very often has a lot of red, which makes other issues difficult to spot
- T1s
  - Issue with file staging at CNAF. GGUS:93165 has been submitted.

CMS reports (raw view) -
- LHC / CMS
  - Rereconstruction of 2012 data in the tails, load at the T1 sites small
- CERN / central services and T0
  - We are beginning to treat CERN more as a T1 in terms of transfers, processing
- Tier-1:
  - IN2P3 HammerCloud and SAM test failures over the weekend seem solved, but (as of last night) the GGUS tickets are still open
    - GGUS:93161
    - GGUS:93163
- Tier-2:
  - ntr

ALICE -
- NTR

LHCb reports (raw view) -
- Mainly user jobs with some MC ongoing.
- T0:
  - SLS sensor for LHCb LFC flickering every 5 minutes.
- T1:

Sites / Services round table:

ASGC - ntr
BNL - ntr
CNAF
- the issue reported by ATLAS appears to be due to the StoRM configuration; the ticket should be updated soon
FNAL - ntr
IN2P3 - ntr
KIT
- the issues with ATLAS storage reported last week are not yet resolved and the cause is still unknown; a downtime may be needed for updating GPFS
NDGF - ntr
NLT1 - ntr
OSG - ntr
PIC - ntr
RAL
- at-risk downtime tomorrow morning for network maintenance, 2 short breaks expected, FTS will be drained beforehand

dashboards - ntr
GGUS/SNOW - ntr
storage
- the CASTOR file update feature has been disabled today as announced

AOB:

Thursday

Attendance:

local: Alex, Jarka, Joel, Luca M, Maarten, Simone
remote: Gareth, Jeff, Kyle, Lisa, Lucia, Michael, Rolf, Stefano, Thomas, Wei-Jen, Xavier

Experiments round table:

ATLAS reports (raw view) -
- NTR

CMS reports (raw view) -
- LHC / CMS
  - Rereconstruction of 2012 data in the tails, load at the T1 sites small. User analysis continues at a constant pace.
- CERN / central services and T0
  - We are beginning to treat CERN more as a T1 in terms of transfers, processing
- Tier-1:
  - ntr
- Tier-2:
  - ntr

ALICE -
- NTR

LHCb reports (raw view) -
- Mainly user jobs with some MC ongoing.
- T0:
  - SLS sensor for LHCb LFC flickering every 5 minutes. (RQF:0190901)
- T1:

Sites / Services round table:

ASGC
- network uplink interrupted Tue afternoon, fixed Tue night
BNL - ntr
CNAF - ntr
FNAL - ntr
IN2P3
- local ALICE contact reported MonALISA tests being in error
  - Maarten: the job numbers are known to be underreported; this will be debugged
  - after the meeting: the issue with job numbers remains to be fixed, while test results look OK
KIT - ntr
NDGF - ntr
NLT1 - ntr
OSG - ntr
RAL
- yesterday's planned intervention affecting FTS went OK

dashboards - ntr
grid services
- upgrade of WMS+LB servers to EMI3 next Tuesday. https://itssb.web.cern.ch/planned-intervention/upgrade-wms-servers-emi-3/16-04-2013
  - includes extra Condor-G upgrade to support ARC-CE 1.x and 2.x (for CMS)
storage
- Apr 18 10:00 proposed for EOS-ALICE upgrade with ~30 min downtime

AOB:

Topic revision: r7 - 2013-04-11 - MaartenLitmaath
