Summary of the HTTP Deployment TF's Activities

The HTTP Deployment TF's activities fell into two areas, as described below.

Policy and advice on HTTP for WLCG

The TF created documents and tools to guide the adoption of HTTP within WLCG.

Operational support for HTTP deployment

The TF created a Nagios probe for use with the SAM/Nagios framework and, with the help of the monitoring team, began to monitor endpoints based on lists from the experiments.

At the beginning of the operational push, the monitoring showed 36 problematic endpoints for Atlas and problems with a fraction of the LHCb endpoints (the exact number was not recorded). After a period of ticketing the sites, during which 79 GGUS tickets were issued and followed up, the monitoring shows (at time of writing) 7 problematic endpoints for Atlas and 2 for LHCb. All sites have been notified of the problems detected.

After cleaning up a number of configuration issues related to sites, to the experiment configuration databases, and to the monitoring, most of the effort was dedicated to tackling instability in the endpoints. A number of issues in DPM configuration were uncovered and advice on how to properly support HTTP was summarised and circulated.
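The TF's Nagios probe itself is not reproduced here. As a rough illustration of what a check of this kind involves, the following is a minimal sketch of a Nagios-style HTTP check in Python. It relies only on the standard Nagios plugin convention of exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The mapping from HTTP status to state, the function names `classify` and `probe`, and the absence of X.509 authentication and WebDAV operations are all assumptions made for illustration, not a description of the real probe.

```python
import sys
import urllib.error
import urllib.request

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(status):
    """Map an HTTP status code (or None for no response) to a Nagios state.

    The thresholds below are illustrative assumptions, not the TF's policy.
    """
    if status is None:
        return CRITICAL, "CRITICAL - no HTTP response"
    if 200 <= status < 400:
        return OK, f"OK - HTTP {status}"
    if status in (401, 403):
        # Endpoint is alive but refused the (unauthenticated) request.
        return WARNING, f"WARNING - HTTP {status}"
    return CRITICAL, f"CRITICAL - HTTP {status}"


def probe(url, timeout=10):
    """Issue a single GET against url and classify the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as error:
        # HTTPError must be caught before URLError, its parent class.
        return classify(error.code)
    except (urllib.error.URLError, OSError):
        return classify(None)


if __name__ == "__main__" and len(sys.argv) > 1:
    state, message = probe(sys.argv[1])
    print(message)  # Nagios reads the first line of output as the status text
    sys.exit(state)
```

Run as a script with an endpoint URL as its only argument, the sketch prints a one-line status and exits with the corresponding Nagios code, which is the contract a SAM/Nagios-style framework expects from a probe.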

Monitoring

During the life of the TF, the monitoring was based purely on ETF. It was decided that once this system became the source of data for production monitoring via SAM3, responsibility for "HTTP operations" would pass to the experiments. This has now happened.

ETF Monitoring

Atlas - https://etf-atlas-prod.cern.ch/etf/check_mk/index.py?start_url=%2Fetf%2Fcheck_mk%2Fview.py%3Fview_name%3Dservicedesc%26service%3Dwebdav.HTTP-All-%2Fatlas%2FRole%253Dproduction

LHCb - https://etf-lhcb-prod.cern.ch/etf/check_mk/index.py?start_url=%2Fetf%2Fcheck_mk%2Fview.py%3Fview_name%3Dservicedesc%26service%3Dwebdav.HTTP-All-%2Flhcb%2FRole%253Dproduction
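The ETF links above embed the target view in a percent-encoded `start_url` parameter, with the service name encoded a second time inside it. The sketch below, using only the Python standard library, recovers the service name being monitored from such a link. The function name `etf_service_name` is hypothetical; the URL is the Atlas link quoted above.

```python
from urllib.parse import parse_qs, urlsplit

ETF_ATLAS_URL = (
    "https://etf-atlas-prod.cern.ch/etf/check_mk/index.py"
    "?start_url=%2Fetf%2Fcheck_mk%2Fview.py%3Fview_name%3Dservicedesc"
    "%26service%3Dwebdav.HTTP-All-%2Fatlas%2FRole%253Dproduction"
)


def etf_service_name(url):
    """Return the service name nested inside an ETF check_mk link."""
    # parse_qs percent-decodes once, yielding the inner view URL.
    outer = parse_qs(urlsplit(url).query)
    inner_url = outer["start_url"][0]
    # Parsing the inner query decodes the second layer (%3D becomes "=").
    inner = parse_qs(urlsplit(inner_url).query)
    return inner["service"][0]


print(etf_service_name(ETF_ATLAS_URL))
# prints: webdav.HTTP-All-/atlas/Role=production
```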

Site View

Atlas - http://wlcg-mon.cern.ch/dashboard/request.py/siteviewhistory?columnid=1337&debug=false

LHCb - http://wlcg-mon.cern.ch/dashboard/request.py/siteviewhistory?columnid=1318&debug=false

The future

To fully profit from the monitoring now in place, the experiments will have to enable vo-feeds which explicitly identify HTTP as a separate service and list the relevant endpoints. This will enable them to integrate HTTP endpoints into their standard operations and thus maintain or improve the stability of the infrastructure.

-- OliverKeeble - 2016-04-29

Topic revision: r5 - 2016-08-04 - OliverKeeble
 