Overview

This page documents the policy for requesting re-computation of SAM3 availability results for WLCG.

Issues affecting the monitoring infrastructure may incorrectly affect the availability results for a given site during a period of time. When such issues are confirmed, the site availability must be verified and, where necessary, updated. In these cases the availability can be overwritten so that the results are not affected by the monitoring infrastructure fault.

Conditions to request a re-computation

Requests for re-computation are only accepted:

  • when failures are due to problems in the monitoring infrastructure (e.g. invalid proxy certificate);
  • when reported up to 10 calendar days after the announcement of the reports of a given month, which normally occurs on the 1st of the following month;
  • when the impact of the re-computation warrants the efforts, as explained here.

Below are some examples to better explain the conditions above:

  • Example 1
    • On 25-Jan-2012 region A requests the re-computation of the region availability due to a hardware problem on SAM-Nagios which happened on 15-Jan-2012.
    • The request is approved. The justification of the problem is valid and the request was reported on time (before the announcement of the report).
  • Example 2
    • On 05-Feb-2012 site B requests the re-computation of the site availability due to a problem with the host certificate of SAM-Nagios which happened on 15-Jan-2012.
    • The request is approved. The justification of the problem is valid and the problem was reported on time (5 days after the announcement of the first report).
  • Example 3
    • On 20-Feb-2012 region C requests the re-computation of the region availability due to a network problem on SAM-Nagios which happened on 15-Jan-2012.
    • The request is rejected. The problem was reported too late (more than 10 days after the announcement of the first report).
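
The timing condition in the three examples can be checked with a little date arithmetic. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes the reports are announced on the 1st of the month following the problem, which the policy describes only as the normal case. It covers the deadline only; whether the cause lies in the monitoring infrastructure and whether the impact warrants the effort still need human judgment.

```python
from datetime import date, timedelta

# From the policy: requests are accepted up to 10 calendar days
# after the announcement of the monthly reports.
GRACE_DAYS = 10


def announcement_date(problem_day: date) -> date:
    """Assumed announcement date: the 1st of the month after the problem."""
    if problem_day.month == 12:
        return date(problem_day.year + 1, 1, 1)
    return date(problem_day.year, problem_day.month + 1, 1)


def request_deadline(problem_day: date) -> date:
    """Last day on which a re-computation request is still on time."""
    return announcement_date(problem_day) + timedelta(days=GRACE_DAYS)


def is_on_time(problem_day: date, request_day: date) -> bool:
    """True if the request arrives on or before the deadline."""
    return request_day <= request_deadline(problem_day)


problem = date(2012, 1, 15)
print(request_deadline(problem))                # 2012-02-11
print(is_on_time(problem, date(2012, 1, 25)))   # Example 1: True
print(is_on_time(problem, date(2012, 2, 5)))    # Example 2: True
print(is_on_time(problem, date(2012, 2, 20)))   # Example 3: False
```

If the reports for a given month are announced on a different day, the deadline moves accordingly, so the real announcement date should replace the assumed one.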

How to request a re-computation

Confirm that the availability of your site or region is affected by browsing the SAM3 service availability interface, then open a new GGUS ticket. Please assign the ticket to the Grid Monitoring Support Unit (3rd level experts).

The GGUS ticket must include the following information:

  • A description of the problem;
  • The site or ROC/NGI affected by the problem;
  • The start and end time of the problem (yyyy-mm-ddThh:mm in UTC);
  • The VO affected by the problem.
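
Before opening the ticket, it can help to check that all the required items are present and that the times follow the requested format. The sketch below is only an illustration: the class and field names are hypothetical and do not correspond to any GGUS API, and the site and VO values are placeholders. Times are parsed with the pattern yyyy-mm-ddThh:mm and are assumed to be given in UTC already.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List

# yyyy-mm-ddThh:mm as requested by the policy; values are assumed to be UTC.
TIME_FORMAT = "%Y-%m-%dT%H:%M"


@dataclass
class RecomputationRequest:
    """Hypothetical container for the information a ticket must include."""
    description: str
    site_or_region: str
    start_utc: str
    end_utc: str
    vo: str

    def problems(self) -> List[str]:
        """Return a list of human-readable problems; empty if none found."""
        errors = []
        for name in ("description", "site_or_region", "vo"):
            if not getattr(self, name).strip():
                errors.append(f"{name} is empty")
        parsed = {}
        for name in ("start_utc", "end_utc"):
            try:
                parsed[name] = datetime.strptime(getattr(self, name), TIME_FORMAT)
            except ValueError:
                errors.append(f"{name} is not in yyyy-mm-ddThh:mm format")
        # Compare the interval only when both times could be parsed.
        if len(parsed) == 2 and parsed["end_utc"] <= parsed["start_utc"]:
            errors.append("end_utc must be after start_utc")
        return errors


request = RecomputationRequest(
    description="Expired proxy certificate on the SAM-Nagios host",
    site_or_region="EXAMPLE-SITE",   # placeholder site name
    start_utc="2012-01-15T08:00",
    end_utc="2012-01-15T14:30",
    vo="examplevo",                  # placeholder VO name
)
print(request.problems())  # []
```

An empty list means the items listed above are all present and well formed; it says nothing about whether the request meets the acceptance conditions.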

How to follow a re-computation

The deadline for requesting re-computations is 10 calendar days after the announcement of the reports for a given month. Assuming the reports are announced on the 1st of the following month, the deadline for requesting re-computations will be the 11th of that month. As soon as the re-computation is complete, the GGUS ticket is closed and the submitter is notified. The SAM3 service availability interface can be used to confirm the new availability numbers. The final report will be published shortly after the deadline.

If there are no requests for re-computation, the first reports published at the beginning of the month are considered the final reports. In any case, no further requests will be considered after the deadline.

Topic revision: r3 - 2020-08-21 - MaartenLitmaath
 