My collection point for Hadoop-related projects around CERN-IT

(personal, no official info here. "real" page is at ItHadoop)

Infrastructure

  • old "ahc" cluster in CDB:
  • "hadoop" cluster in Puppet: Cloudera CDH4.1.2
    • Namenode: lxbrf39c04
    • see "CDBHosts --puppet_hostgroup=hadoop/datanode" for rest
  • IT-DB ex-RAC7

Projects / Test

The WLCG Database TEG in their final report (Apr 2012) made the recommendation to IT to
".. deploy a suitably sized Hadoop cluster, [..and ] Hadoop clients, including pig/hive should be made available on user interfaces(lxplus), together with a reasonably sized HBase installation. We make no operational requirements on the cluster [..]"
(This TEG was chaired by Dario, so some overlap with ATLAS wishes)

  • chat with IanB: under which conditions could IT offer this as a service?
    • clearly manpower-limited - application level support & up must be with the experiment.
    • split instances might work (self-serve model for new Hadoop instances), as long as not 1 SM per instance.. and as long as the resources come from the experiment allocation (probably wall-clock time for CPU, while the instance is up, plus reserved storage)
    • ideal: shared instance, e.g. on batch nodes (use some cores for Hadoop, rest for batch - closer to sweet spot for Hadoop). Need to see which productionizing steps are required, and whether (inevitable) clashes/overlapping utilization are harmful
      • security on
      • accounting - need to report the actual use for storage and CPU (and ideally remove from CASTOR/EOS and LXBATCH quota)
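
Rough sketch of the accounting idea above (wall-clock CPU while the instance is up, plus reserved storage). Everything here is an assumption for illustration only: the `InstanceUsage` record, its field names and the example numbers are hypothetical, not an existing IT accounting interface.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class InstanceUsage:
    """Hypothetical per-instance record; all fields are illustrative."""
    cores: int                  # cores set aside for the Hadoop instance
    reserved_storage_tb: float  # storage reserved for the instance, in TB
    started: datetime           # when the instance came up
    stopped: datetime           # when the instance was shut down

    def wall_clock_hours(self) -> float:
        # Charge for the whole time the instance is up, busy or idle.
        return (self.stopped - self.started).total_seconds() / 3600.0

    def core_hours(self) -> float:
        # CPU charge = number of cores x wall-clock hours.
        return self.cores * self.wall_clock_hours()


# Made-up example: 8 cores and 20 TB reserved, instance up for 36 hours.
usage = InstanceUsage(
    cores=8,
    reserved_storage_tb=20.0,
    started=datetime(2013, 5, 1, 0, 0),
    stopped=datetime(2013, 5, 2, 12, 0),
)
print(usage.core_hours(), "core-hours,", usage.reserved_storage_tb, "TB reserved")
```

The two numbers would then be what gets reported against the experiment allocation (and ideally deducted from the LXBATCH and CASTOR/EOS quota).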

ATLAS

ATLAS formally (B.Kersevan+H. von der Schmitt memo to I.Bird, 2013-01-30) asked for IT support around Hadoop, initially for prototyping, in 3 areas:
  • PanDA logging: Archival and querying of job and file records in PanDA (in development)
  • EventIndex (starting design)
  • Distributed Data Management (DDM) accounting and related activities (already in test production)

ATLAS-TAG

ATLAS-DDM

(see memo for current setup)

IT-internal

AI monitoring/GNI

  • Pedro/Miguel

CASTOR logviewer

OpenLab

  • Bob: interest expressed by 2 partners, no project yet
  • initial testing by Maaike (openlab fellow, worked with IT-DB/Oracle): "physics analysis inside databases"
Topic revision: r5 - 2013-05-28 - JanIven
 