BNL configuration notes and deployment of dCache 1.8 patch level 13

BNL's second dCache 1.8 instance, dcsrmv2.usatlas.bnl.gov, has been upgraded to patch level 13. The infrastructure of the system is as follows.

  • Admin node (dcache02.usatlas.bnl.gov)
    • Memory: 8G
    • Available disk: 2TB
    • Services: Hosts the dCache admin components, such as PnfsManager, PoolManager and the administrative interface. In the test bed we did not separate PNFS from the dCache admin node, so this machine also hosts PNFS. For convenience, we also installed a dcap door on this node.

  • SRM door (dcsrmv2.usatlas.bnl.gov)
    • Memory: 8G
    • Available disk: 2TB
    • Services: SRM2.2, Utility and SRM database

  • GridFtp Door (dcdoor99.usatlas.bnl)
    • Memory: 5G
    • Available disk: 30G
    • The machine is outside the BNL firewall and has two NICs
    • Services: GridFTP door with two interfaces

  • Read/Write pools (dc002.usatlas.bnl.gov)
    • Thumper with SunOS 5.10
    • Memory: 16288 Megabytes
    • Available disks: 16 TB
    • Services: Four dCache pools (read/write).

The system uses HPSS as its backend tape storage system. The LinkGroup is considered Custodial. The PoolManager.conf file is attached.

  • PoolManager.conf: A file describing the link, link group and pool relationships in dCache 1.8

-- Yingzi (Iris) Wu - 24 Aug 2007

BNL configuration notes and deployment of dCache 1.8 patch level 1

This information is presented in two parts. The first part shows the current configuration of the BNL test point used in Flavia's tests. The second part describes the experience with the previous dCache 1.8 patch level 1 installation.

Current BNL's dCache1.8 patch level 1 configuration

This is a standalone dCache 1.8 installation; all dCache 1.8 components run on the same server.

The configured storage class is:

REPLICA-ONLINE (Tape0Disk1)

BNL's dCache 1.8 patch level 1 is installed on a single server with the following specifications:

  • CPU speed 3400 MHz.
  • mem_total 4149240 KB.
  • Linux release 2.6.9-42.0.8.ELsmp.
  • The server is located outside BNL's firewall; no tape storage is configured.
  • Disk space for storage 60GB.

General notes of dCache1.8 patch level 1 installation and configuration

I used the information provided by the dCache website, Timur's website and notes from a previous standalone dCache installation (1.7).

Attached to this page are the main configuration files I used to deploy dCache 1.8 patch level 1.

For dCache installation and configuration:

  • node_config
  • pool_path
  • dCacheSetup
  • dcachesrm-gplazma.policy
  • PoolManager.conf
  • srm.batch
  • utility.batch
  • Pool setup file (setup), the same for all pools

Components of the dCache1.8 installation:

As can be seen from the configuration files I used, the installation consists of:

  • 3 write pools of 20GB each
  • gPlazma turned on
  • 1 gridftp door
  • 1 dcap
  • 1 GSIDCAP door
  • Pnfs
  • Admin cell

Configuration files customized for this installation:

Besides the parameters that needed to be adjusted to install dCache on one server with the features mentioned above, the following parameters were changed:

dCacheSetup

  • srmCopyReqThreadPoolSize=12
  • remoteGsiftpMaxTransfers=12 (assuming 4 GridFTP transfers per pool and 3 pools, so 4*3 = 12)
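
The sizing rule behind these two values can be written out explicitly. The following is a minimal sketch, assuming the limit is simply the number of pools times the GridFTP transfers allowed per pool, as stated above; the function name `max_transfers` is hypothetical and not part of dCache.

```python
def max_transfers(pools: int, transfers_per_pool: int) -> int:
    """Return pools * transfers_per_pool, the sizing rule used in these notes."""
    if pools < 0 or transfers_per_pool < 0:
        raise ValueError("counts must be non-negative")
    return pools * transfers_per_pool


# Current test point: 3 pools, 4 GridFTP transfers per pool.
current = max_transfers(pools=3, transfers_per_pool=4)

# Both dCacheSetup parameters were set to the same value.
print(f"srmCopyReqThreadPoolSize={current}")
print(f"remoteGsiftpMaxTransfers={current}")
```

The same rule gives 25 for the earlier layout of 5 pools with 5 transfers per pool.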

PoolManager.conf

  • set timeout pool 240

Using the admin shell on the SRM cell, I changed these two parameters:

(SRM-dct00) admin > set max ready get 20
(SRM-dct00) admin > set max ready put 20

References for installation

  • General instructions followed from the dCache Book and Timur's page.
  • SRM configuration:
    • http://home.fnal.gov/~timur/dCacheBook/cf-srm.html
    • http://www.dcache.org/manuals/Book/cf-srm.shtml
    • http://www.dcache.org/

Experience with dCache 1.8

The following summarizes how the installation behaved before the BNL test point passed Flavia's tests.

I observed that the system was in one of three states:

  • State 1: After a fresh reboot the system reached its best performance. The tests that failed in this state were:
    • ReleasedFiles
    • Mv (this test returned returnStatus=SRM_REQUEST_QUEUED)

  • State 2: After a clean start and more than 10 hours of operation, the number of tests that returned SRM_REQUEST_QUEUED increased:
    • 06_StatusOfBringOnlineRequest
    • 09_ReleaseFiles
    • 02_StatusOfPutRequest
    • 04_PutDone
    • 05_PrepareToGet
    • 05_StatusOfGetRequest

  • State 3: The following tests reported "globus-url-copy failed 137":
    • 05_StatusOfGetRequest
    • 06_BringOnline
    • 06_StatusOfBringOnlineRequest
    • 09_ReleaseFiles

Tracing a specific test that reported SRM_REQUEST_QUEUED

Looking into the dCache log files, such as catalina.out, and tracing the file used to perform this test, I found the following information:

05_StatusOfGetRequest: Executing srmPrepareToPut, putRequestToken=-2147472416
05_StatusOfGetRequest: fileRequests.expectedFileSize[{2691 }]
05_StatusOfGetRequest: desiredFileStorageType=PERMANENT
05_StatusOfGetRequest: srmPrepareToPut, returnStatus=SRM_REQUEST_QUEUED
05_StatusOfGetRequest: srmStatusOfPutRequest, returnStatus=SRM_SUCCESS
05_StatusOfGetRequest: srmStatusOfPutRequest, remainingTotalRequestTime=
05_StatusOfGetRequest: srmPutDone, fileStatuses=surl0=srm://dct00.usatlas.bnl.gov:8443/srm/managerv2?SFN=//pnfs/usatlas.bnl.gov/data/dteam/20070524-210113-28241-0.txt returnStatus.explanation0=Done returnStatus.statusCode0=SRM_SUCCESS
05_StatusOfGetRequest: Put cycle succeeded
05_StatusOfGetRequest: srmPrepareToGet, getRequestToken=-2147472414
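
An excerpt like the one above can be scanned mechanically for return statuses. The following is a minimal sketch, assuming each relevant line has the shape `test: call, returnStatus=STATUS` seen in the excerpt; the function name `return_statuses` and the embedded sample are illustrative only.

```python
import re

# Three lines copied from the excerpt above.
LOG = """\
05_StatusOfGetRequest: srmPrepareToPut, returnStatus=SRM_REQUEST_QUEUED
05_StatusOfGetRequest: srmStatusOfPutRequest, returnStatus=SRM_SUCCESS
05_StatusOfGetRequest: Put cycle succeeded
"""

# Matches "<test>: <srm call>, returnStatus=<SRM_...>" and nothing else.
PATTERN = re.compile(
    r"^(?P<test>\S+): (?P<call>srm\w+), returnStatus=(?P<status>SRM_\w+)$"
)


def return_statuses(text):
    """Return (test, call, status) tuples for lines that carry a returnStatus."""
    results = []
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            results.append((match["test"], match["call"], match["status"]))
    return results


statuses = return_statuses(LOG)
# Calls whose request was queued rather than completed immediately.
queued = [call for _, call, status in statuses if status == "SRM_REQUEST_QUEUED"]
print(queued)
```

Lines without a `returnStatus=` field, such as "Put cycle succeeded", are skipped.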

Using the pnfs ID assigned to this file in the SRM log, 000100000000000000095F48 (/pnfs/usatlas.bnl.gov/data/dteam/20070524-210113-28241-0.txt), it should be possible to locate the file on a particular pool:


(PnfsManager) admin > cacheinfoof 000100000000000000095F48
cacheinfoof 000100000000000000095F48

No pool was returned

However, the file does exist on the pool:

[root@dct00 data]# pwd
/data/data5/dcache_pool_5/pool/data
[root@dct00 data]# ls -l 000100000000000000095F48
-rw-r--r--  1 root root 2691 May 24 15:01 000100000000000000095F48

Then, looking in the admin interface of the pool:

(dct00_5) admin > rep ls -l 000100000000000000095F48
rep ls -l 000100000000000000095F48
000100000000000000095F48 <C-------X--(0)[0]> 2691 si={myStore:STRING}
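
To cross-check the pool's view against the file system, the `rep ls` line can be split into its fields and the size compared with the `ls -l` output. The following is a minimal sketch, assuming the line layout shown above (ID, flags in angle brackets, size, `si={...}`); the function name `parse_rep_ls` is hypothetical, and the flag string is kept as-is without interpretation.

```python
import re

# Layout assumed from the single line shown above: id <flags> size si={...}
REP_LS = re.compile(
    r"^(?P<pnfsid>[0-9A-Fa-f]+)\s+<(?P<flags>[^>]+)>\s+"
    r"(?P<size>\d+)\s+si=\{(?P<storage>[^}]*)\}$"
)


def parse_rep_ls(line):
    """Split one 'rep ls -l' output line into its fields."""
    match = REP_LS.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized rep ls line: {line!r}")
    return {
        "pnfsid": match["pnfsid"],
        "flags": match["flags"],          # left uninterpreted
        "size": int(match["size"]),
        "storage_info": match["storage"],
    }


entry = parse_rep_ls(
    "000100000000000000095F48 <C-------X--(0)[0]> 2691 si={myStore:STRING}"
)

size_on_disk = 2691  # size reported by ls -l in the pool data directory
print("sizes agree:", entry["size"] == size_on_disk)
```

Here the pool repository and the file system report the same size, even though PnfsManager returned no pool for the file.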


Reinstallation of the dCache1.8 patch level 1

To get a clean, fresh installation of the dCache components, I decided to reinstall dCache 1.8 patch level 1, including the databases and pnfs. Nevertheless, I kept the previous dCache configuration files and used them to configure the new installation, with the following changes:

Changes on the configuration for the new deployment of dCache 1.8

  • Reduced the write pools from 5 to 3.
  • Increased the pool timeout from 120 to 240.
  • Set srmCopyReqThreadPoolSize to 12 (assuming 4 GridFTP transfers per pool and 3 pools, so 4*3 = 12); previously 25, assuming 5 GridFTP transfers per pool and 5 pools.
  • remoteGsiftpMaxTransfers=12; previously 25.
  • maxReadyJobs=20; previously 25.
  • Mover queue of 30 per pool; the previous installation, with 5 pools, used 18 per pool.

After applying these changes, the test point passed Flavia's tests. It seems to me that the problem came down to tuning the system to match the load of test requests, so that consecutive requests did not remain queued.

-- Main.cgamboa - 21 Jun 2007

Topic attachments
  • PoolManager.conf (r1, 2.6 K, 2007-08-24 18:13): A file describing the link, link group and pool relationships in dCache 1.8
  • dct00.bnl.configuration.files.tar (r1, 60.0 K, 2007-06-21 20:32): Files used to configure the BNL dCache 1.8 test point
Topic revision: r6 - 2007-08-24 - CarlosGamboa