All About REBUS

REBUS is the Resource, Balance and Usage website for the whole of WLCG, including topology information, resource pledges, and installed capacities.

Topology Information

REBUS is intended to be the authoritative source for which sites are part of WLCG.

Federations and Sites are manually managed in REBUS by WLCG Office.

A Federation is entered in REBUS as soon as an MoU has been signed. The Federation informs WLCG Office by email which sites are to be included in the Federation (this may happen at the same time, or a few weeks later).

Sites can be decommissioned by a Federation, in which case the Federation should inform WLCG Office as soon as possible so that the site can be retired from the active REBUS list. This prevents misleading reporting and metrics.

Resource Pledges

Resource pledges are collected once a year - all federations are mailed by WLCG Office around late June/early July. The data is manually added to REBUS by the Federation directly (access privileges are managed by e-group).

The results are made into a report, which is presented at the Autumn (~October) Computing Resources Review Board. It is scrutinized by the Experiments and the Funding Agencies, and is a reference for resources officially available to WLCG.

The Resource Pledges report is used as Annex 6 in the MoU.

  • Federation Resources: it contains disk, tape and HS06 pledges per federation. The view shows what percentage of the requested capacity is achieved by the pledge. It offers a historical yearly view.
  • VO Requirements: These are extracted from the Computing Scrutiny Group's report presented at the Spring (~April) Computing Resources Review Board. The requirements are added every year (i.e. once the Experiments have confirmed and agreed on their requirements for the following year). The view shows total requested capacity per Tier for CPU, disk and tape (tape only for T0 and T1). It offers a historical yearly view.
  • Pledge Summary: a table displaying the federation pledges against the VO requirements
  • Pledge Validation: this allows the federations to enter pledge information, with an associated comment for each pledge. It is only accessible to the site responsibles who have the necessary edit rights in REBUS.

Installed Capacities

Installed Capacities are taken directly from the top BDII which is queried every hour. Different views are displayed:

  • Federation Capacities: this is an aggregation of the installed capacities published by the sites within the federation.
  • Site Capacities: this view shows two possible values for Logical CPU, Physical CPU, HS06, total online storage and total nearline storage.
    • For the current month and year, it shows what is displayed in the BDII (see REBUS known issues for more details)
    • For anything else in the past, it shows a monthly average value.
  • VO shares: for each site, it shows site capacity published for each VO. This requires sites to properly publish VOView information in the BDII. Otherwise, the capacity displays 0.
  • Capacity and Pledge comparison: for each site, it compares the capacity of the site as published by the BDII to the one reported in the pledges view
  • Storage Capacity comparison: for each site, it compares the total storage with the total installed storage for both online and nearline storage.

Accounting Reports

Accounting reports are generated once a month. The reports can be found in the WLCG Document Repository.

Tier 1

The Tier1 accounting report is sent to the WLCG Management Board. It is also presented to the twice-yearly Computing Resource Review Board, where it is scrutinized by the Funding Agencies.

REBUS is used as the input form for Tier 1 accounting to generate the final reports. The final reports are visualized in Excel and distributed as a pdf.

The REBUS input form is pre-filled automatically from the APEL/Accounting portal and with manual data from previous reports, as shown in the picture (rebus.png).

  • Installed capacity is provided manually by the site - data automatically carried over from previous month's report.
  • Grid CPU and Grid Wall are extracted from APEL/Accounting portal
  • Non-Grid is provided manually by site
  • Disk Usage and Tape Used are provided manually by the site
  • Disk Allocated is carried over from the previous month's report, where it was manually provided
  • All data can be (and is intended to be) manually modified. (NB one thing that is still missing is a history, or 'historique', of what was changed)
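
The pre-fill rules listed above can be sketched as follows. This is only an illustration of the data flow, not the REBUS code: the `Tier1Row` class, its field names and the `prefill` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Tier1Row:
    """One site's monthly Tier 1 input form (hypothetical field names)."""
    installed_capacity: Optional[float] = None  # manual, carried over
    grid_cpu: Optional[float] = None            # from APEL/Accounting portal
    grid_wall: Optional[float] = None           # from APEL/Accounting portal
    non_grid: Optional[float] = None            # manual, entered by the site
    disk_used: Optional[float] = None           # manual, entered by the site
    tape_used: Optional[float] = None           # manual, entered by the site
    disk_allocated: Optional[float] = None      # manual, carried over


def prefill(previous: Tier1Row, apel_cpu: float, apel_wall: float) -> Tier1Row:
    """Build this month's form from last month's row and the APEL figures."""
    return Tier1Row(
        installed_capacity=previous.installed_capacity,  # carried over
        grid_cpu=apel_cpu,                               # automatic
        grid_wall=apel_wall,                             # automatic
        disk_allocated=previous.disk_allocated,          # carried over
        # non_grid, disk_used and tape_used stay empty until the site fills them in
    )
```

Every field remains editable after pre-filling, which matches the last bullet above.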

Excel is used to visualize the report; a web query picks up the accounting data, comments and the pledges stored in REBUS, to produce the pdf accounting report.

Any data provided here only feeds into the monthly pdf accounting report; none is fed back into APEL.

Tier 2

The Tier 2 accounting reports are sent to the Management Board and to the Tier 2 sites admins/responsibles.

The data is directly extracted from the Tier 2 APEL/Accounting portal.

The data is pulled into an Excel spreadsheet (via 3 spreadsheets which provide logic checks on the data), and published as a pdf.

Sites sometimes email the LCG Project Office to request updates to this spreadsheet; no data is fed back to APEL. However, any significant change will normally involve APEL, as it indicates an issue in data transmission between the site and the Portal.

Trends

REBUS can produce accounting plots for these quantities:

  • CPU time
  • Wallclock time
  • CPU/WC efficiency
  • Percentage of pledged hours used (using wallclock time)

The data originates from the EGI accounting portal. The time granularity is monthly, and the latest pledges in REBUS at a given month are used to calculate the percentage (for example, in 2015 the 2015 pledges are used starting from April).
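
As a rough illustration of these quantities, the sketch below computes the CPU/WC efficiency and the percentage of pledged hours used for one month. It rests on assumptions that are not stated on this page: that the April switch-over in the 2015 example applies to every year, that wallclock time is expressed in HS06-hours, and that the pledged hours for a month are the pledge multiplied by the number of hours in that month. All function names are hypothetical.

```python
import calendar

# Assumption: pledges for year Y apply from April of year Y, generalising
# the 2015 example given in the text.
PLEDGE_YEAR_START_MONTH = 4


def pledge_year(year, month):
    """Return the pledge year assumed to be in force for a calendar month."""
    return year if month >= PLEDGE_YEAR_START_MONTH else year - 1


def efficiency(cpu_hours, wall_hours):
    """CPU/WC efficiency as a fraction; 0.0 when no wallclock time was used."""
    return cpu_hours / wall_hours if wall_hours else 0.0


def pledged_hours_used_pct(wall_hs06_hours, pledge_hs06, year, month):
    """Percentage of pledged HS06-hours used in one month (assumed formula)."""
    days_in_month = calendar.monthrange(year, month)[1]
    pledged_hs06_hours = pledge_hs06 * 24 * days_in_month
    if not pledged_hs06_hours:
        return 0.0
    return 100.0 * wall_hs06_hours / pledged_hs06_hours
```

For a site that pledged 1000 HS06, April 2015 has 720 hours, so the denominator under these assumptions is 720000 HS06-hours.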

REBUS web application and codebase

REBUS known issues

Capacities known issues

The following list of known issues describes the internal logic REBUS uses to calculate capacities. Because of this logic, BDII and REBUS values don't always match. This inconsistency causes a lot of confusion among sys admins who do not understand where the REBUS numbers come from. For this reason, and because these values in REBUS are not actually used by any LHC VO, the IS TF is proposing to remove the installed capacities view from REBUS.

Physical CPUs, Logical CPUs and HEPSPEC

  • The total number of Physical CPUs, Logical CPUs and HEPSPEC in a site is calculated using the numbers published in the following GLUE 1.3 variables:
    • GlueSubClusterLogicalCPUs
    • GlueSubClusterPhysicalCPUs
    • GlueProcessorOtherDescription
  • The total numbers are calculated by adding the values of these variables for all the GlueSubCluster objects published by the site in the BDII. However, if two or more subclusters publish the same number of logical CPUs, the values for each of the three variables are taken into account only once, as REBUS considers that the subclusters are double-counting resources.
  • The exact piece of code in REBUS taking care of this is available here (lines 251-255).
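
The de-duplication rule described above can be sketched as follows. This is not the REBUS code referenced in the list; it is a minimal illustration that assumes each subcluster has already been reduced to a dictionary with hypothetical keys `logical_cpus`, `physical_cpus` and `hepspec` (the last one derived beforehand from GlueProcessorOtherDescription).

```python
def site_cpu_totals(subclusters):
    """Sum per-subcluster values, counting each logical-CPU figure only once.

    A subcluster whose logical CPU count was already seen for this site is
    skipped entirely, mirroring the double-counting heuristic described above.
    """
    seen_logical = set()
    totals = {"logical_cpus": 0, "physical_cpus": 0, "hepspec": 0.0}
    for sc in subclusters:
        if sc["logical_cpus"] in seen_logical:
            continue  # treated as a duplicate of an earlier subcluster
        seen_logical.add(sc["logical_cpus"])
        for key in totals:
            totals[key] += sc[key]
    return totals
```

The sketch also shows why REBUS and BDII totals can differ: two genuinely distinct subclusters that happen to publish the same number of logical CPUs are counted once.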

Total Online and Nearline Storage

  • The total online and nearline storage in a site is calculated using the numbers published in the following GLUE 1.3 variables:
    • GlueSETotalOnlineSize
    • GlueSETotalNearlineSize
  • The total numbers are calculated by adding the values of these variables for all the GlueSE objects published by the site in the BDII. REBUS checks the GlueForeignKey attribute in GlueSE describing the site name, and it only takes the GlueSE into account if it matches the GOCDB/OIM site name.
  • The exact piece of code in REBUS taking care of this is available here (lines 267-270).
  • This explains inconsistencies in sites like UKI-SOUTHGRID-OX-HEP as described in GGUS:121641.
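
A minimal sketch of the site-name filter described above, assuming each GlueSE has been read into a dictionary and that the foreign key takes the form `GlueSiteUniqueID=<site name>`. The exact, case-sensitive comparison and the function name are assumptions made for illustration; the actual REBUS matching may differ.

```python
def site_storage_totals(site_name, glue_ses):
    """Sum online and nearline sizes of the GlueSEs that belong to site_name.

    A GlueSE is counted only if one of its GlueForeignKey values names the
    site exactly as it is registered in GOCDB/OIM.
    """
    expected_key = "GlueSiteUniqueID=" + site_name
    online = 0
    nearline = 0
    for se in glue_ses:
        if expected_key not in se.get("GlueForeignKey", []):
            continue  # site name mismatch: this SE is ignored
        online += int(se.get("GlueSETotalOnlineSize", 0))
        nearline += int(se.get("GlueSETotalNearlineSize", 0))
    return online, nearline
```

An SE that publishes a site name different from the registered one contributes nothing to the totals, which is one way REBUS can show less storage than the BDII.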

Sites appearing in both lcg-bdii and OIM

  • REBUS logic checks the following sources of information to calculate HEPSPEC, Total Online and Nearline Storage:
    • First: lcg-bdii
    • Second: OIM using the following feed
  • If a site is defined in both places, the values obtained in the lcg-bdii will be overwritten by the values available in OIM.
  • This explains inconsistencies in sites like UNIBE-LHEP that is published in both lcg-bdii and OIM, but with wrong information in OIM, as described in INC:0997821.
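
The precedence rule above can be sketched as a per-site merge in which OIM values replace the lcg-bdii values for HEPSPEC and storage. The dictionary keys and the function name are hypothetical; this is an illustration, not the REBUS implementation.

```python
# Fields for which, per the list above, OIM takes precedence over lcg-bdii.
OVERRIDABLE_FIELDS = ("hepspec", "online", "nearline")


def merge_site(bdii_values, oim_values):
    """Combine the values of one site, letting OIM overwrite lcg-bdii."""
    merged = dict(bdii_values or {})
    if oim_values is not None:
        for field in OVERRIDABLE_FIELDS:
            if field in oim_values:
                merged[field] = oim_values[field]  # OIM wins, even if wrong
    return merged
```

Because the OIM value wins unconditionally in this sketch, a wrong value in OIM hides a correct one in lcg-bdii.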

Physical CPUs and Logical CPUs in OSG sites

  • Physical CPUs and Logical CPUs are not taken into account for OSG sites.
  • These values are not published in the OIM feed
  • For this reason, REBUS code sets them to 0 (like USCMS-FNAL-WC1).
  • The exact piece of code in REBUS taking care of this is available here (lines 128-129).
  • Some OSG sites (like BNL-ATLAS) do show non-zero values for Physical CPUs and Logical CPUs. This is because these sites also publish these variables in lcg-bdii, as explained in the previous section.

-- MariaALANDESPRADILLO - 2015-07-15

Topic revision: r24 - 2016-06-08 - MariaALANDESPRADILLO