Date and Location -------------------------

    1. July 2005
    2. :00 - 17:30

Attendees ---------------------------------

ASCC
Di Qing

BNL
FNAL
Jon

FZK
Jens Rehn

IN2P3

INFN
Luca, Michele Michelotto, Daniele Bonacorsi

SARA
Ron

NDGF
Lars

TRIUMF
Reda

PIC
Gonzalo, Jose Hernandez

RAL
Andrew Sansum, Martin, Derek

CERN
Jamie Shiers, James Casey, Gavin McCance, Patricia Mendez Lorenzo, Sophie Lemaitre, Vlado Bahyl, Roberto

DESY
Michael Ernst, Patrick Fuhrmann

Expts
Lassi Tuura, Nick Brook, Tim Barrass

Subject -------------------------------------

CERN Overview
Not as far along as we want, but the major castor2 problems are now solved and the rate is back up.

Need to work on ASCC

RAL
Problem with gridftp servers hanging from time to time.

TRIUMF
Pool node hangs at 99% CPU on one transfer and accepts no more connections. Are other people seeing that?

Michael: Can you send us the logs for that node?

SARA
Ron: We saw a lot of put entries. Restarting didn't really help; we had to clean up the postgres database. We've also had a pool fill up, and it seems some other pool nodes froze. We didn't get enough information from the log files, but the debug limit is up again.

This morning a pool node crashed twice because it ran out of memory when we tried to get up to 150 MB/s. It is now throttled back down.

Michael
What is running out of memory - the JVM?

Ron
Yes.

Lassi: We see some problems with timeouts at RAL and CNAF.

Derek
We allocated 12 TB; 10 TB was already allocated.

Michael
10 streams per transfer - on average 8 transfers per pool node.

Jon
At FNAL we see 20 streams per transfer - 5 per pool with 2 pool partitions per system.
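
As a rough illustration of the figures quoted by Michael and Jon, the sketch below multiplies them out to give the number of concurrent streams. It assumes that Jon's "5 per pool" means 5 transfers per pool partition, which the minutes do not state explicitly. The function and variable names are illustrative only.

```python
def concurrent_streams(streams_per_transfer, transfers_per_pool, pools_per_system=1):
    """Streams open at once on one system: streams x transfers x pools."""
    return streams_per_transfer * transfers_per_pool * pools_per_system

# Figures quoted by Michael: 10 streams per transfer, about 8 transfers per pool node.
michael_streams = concurrent_streams(10, 8)

# Figures quoted by Jon for FNAL: 20 streams per transfer, 5 per pool
# (assumed here to mean 5 transfers), 2 pool partitions per system.
fnal_streams = concurrent_streams(20, 5, pools_per_system=2)

print(michael_streams)  # 80
print(fnal_streams)     # 200
```

Under that reading, the FNAL configuration holds roughly two and a half times as many streams open per system as the configuration Michael describes.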

IN2P3
We had problems for the last two days - srmcp worked fine, but glite-url-copy works.

FZK
Jos: We now have transfers, but only 3 pool nodes are connected due to network problems on the FZK side.

NDGF

PIC
Gonzalo: Saw many problems yesterday. Started again this morning and it is running smoothly.

James
Tuning might be different.

ASCC
Di: Network bandwidth is very low - only one machine is being used, with gridftp. Need to get SRM working.

Michael
We've seen some problems with individual nodes in the castorgridsc cluster - are these resolved?

James
Can we get a path at DESY that we can write to? It is faster to fix things ourselves than to round-trip.

Jamie
We won't talk about the service phase now - we are focussing on the throughput phase. But we need to prepare for the GDB - sites can co-ordinate with Jeremy/Jamie/Kors. One issue is that the sites will come back with their sample jobs for validation.

James
Load generator can be run from either end. If you want to run your own, feel free - fill in the table in SCThreeThroughputHowto to say if you want to run it yourself.

Jamie
No meeting next week due to the GDB - people can dial into that.
Topic revision: r1 - 2005-07-13 - unknown
 