-- Main.mbarroso - 20 Mar 2007
This page contains a collection of the answers given by the ROCs/sites to the network monitoring survey as requested by Mathieu Goutelle (SA2).
Asia Pacific
Our current network monitoring status and plans are as follows:
Smokeping:
We use Smokeping to monitor network latency, packet loss and jitter from ASGC to each of the sites in the region. This allows us to monitor network performance between the ROC and the RCs. It also gives us a general idea of the network quality at each RC, although this picture may not be entirely accurate.
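As an illustration of the three quantities mentioned above, the following minimal sketch summarizes one round of probes. It is not Smokeping's own algorithm: the function name `summarize_probes`, the input format (a list of round-trip times in milliseconds, with `None` for a lost probe) and the jitter definition (mean absolute difference between consecutive received round-trip times) are all assumptions made for this example.

```python
from statistics import median


def summarize_probes(rtts_ms):
    """Summarize one round of probes.

    rtts_ms: list of round-trip times in milliseconds; None marks a lost probe
    (hypothetical input format, chosen for this sketch).
    """
    sent = len(rtts_ms)
    received = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0
    if not received:
        return {"median_ms": None, "loss_pct": loss_pct, "jitter_ms": None}
    # Assumed jitter definition: mean absolute difference between
    # consecutive received round-trip times.
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    jitter = sum(diffs) / len(diffs) if diffs else 0.0
    return {
        "median_ms": median(received),
        "loss_pct": loss_pct,
        "jitter_ms": jitter,
    }
```

Because the probes originate at a single point, such a summary describes the path from that point to the site, which is one reason the picture of a site's overall network quality may be incomplete.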
MRTG and Weathermap:
We use these to collect network usage information, but only from ASGC-managed routers in our international networks.
Nagios:
Nagios is used for fault detection on ASGC-owned network equipment and for checking the connectivity status of the APROC RCs. If a network fault occurs at any RC, we are notified by email.
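For illustration, a connectivity check of this kind can be written as a small Nagios plugin. The sketch below is not the check used at ASGC; the names `classify`, `probe` and `main`, the TCP-connect approach and the one-second warning threshold are assumptions made for the example. It relies only on the standard Nagios plugin convention of one status line on standard output and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The email notification itself is sent by Nagios, not by the plugin.

```python
import socket
import time

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(connect_seconds, warn_after=1.0):
    """Map a connect time (or None if unreachable) to a Nagios state and message."""
    if connect_seconds is None:
        return CRITICAL, "CRITICAL - host unreachable"
    if connect_seconds > warn_after:
        return WARNING, "WARNING - connect took %.2fs" % connect_seconds
    return OK, "OK - connect took %.2fs" % connect_seconds


def probe(host, port, timeout=5.0):
    """Return the TCP connect time in seconds, or None if the connection fails."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:
        return None


def main(argv):
    """argv: [HOST, PORT]. Prints one status line and returns the exit code."""
    if len(argv) != 2:
        print("UNKNOWN - usage: HOST PORT")
        return UNKNOWN
    code, message = classify(probe(argv[0], int(argv[1])))
    print(message)
    return code
```

When run as a script, the plugin would end with `sys.exit(main(sys.argv[1:]))` so that Nagios receives the exit code.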
We are also testing further network monitoring tools, such as perfSONAR and IEPM. However, this is still work in progress.
Please let us know if you need further information for NPM.
Central Europe
A regional installation of Nagios is used to detect problems with
network links.
Monitoring of network traffic is done separately by each site. The most
popular tool is Ganglia: almost all sites report having it
installed. Cacti (http://www.cacti.net/) is also popular.
Additionally, the following tools are used at sites:
Some small sites report having no network monitoring at all.
CERN
All information is contained in the CERN monitoring portal:
http://cern.ch/monitoring
DECH
Network monitoring tools used at Gridka: Cacti, Netflow and IEPM.
Cacti is used to collect per-port statistics on throughput and error status.
With the help of Netflow we collect specific transfer statistics between end systems.
In addition, an IEPM monitor is implemented at Gridka (see
http://lhc-opn-mon-fzk.gridka.de/iepm-bw.fzk.de/slac_wan_bw_tests.html).
All public monitoring information for Gridka can be accessed here:
http://www.gridka.de/monitoring/main.html
Please let us know if you need any more details.
France
Tier1 + Tier2s:
Network usage information: Cricket and Weathermap
http://netstat.in2p3.fr
Fault detection and alarm system (Java client, SMS, email, ...): home-made tool
http://netsurv.in2p3.fr
Security and performance tools: home-made tool
http://lpsc.in2p3.fr/extra/ (cf. NetFlow)
Tier1:
Other tools: RIPE TTM, IEPM, PerfSonar, EGEE NPM
Italy
Monitoring and alarm tools used at INFN (in particular at Tier1/Tier2 sites):
MRTG to collect network usage information;
Nagios is used for fault detection and as an alarm system (via email and SMS).
NetFlow/sFlow are used to identify the top talkers, top network flows, top network applications, etc.
GridICE is used to monitor grid information.
INFN-T1 and INFN-BARI are testing LEMON (for monitoring purposes only), in particular for their storage systems.
At CNAF we are also testing Argus (NIDS).
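As an illustration of the "top talker" reports mentioned in the list above, the following minimal sketch ranks sources by total bytes sent. It is not the tooling used at INFN: the function name `top_talkers` and the flow record shape (source address, destination address, byte count) are assumptions made for the example, and real NetFlow/sFlow records would first have to be exported and decoded by a collector.

```python
from collections import Counter


def top_talkers(flows, n=3):
    """Return the n sources that sent the most bytes.

    flows: iterable of (src_ip, dst_ip, nbytes) tuples
    (hypothetical record shape, chosen for this sketch).
    """
    per_source = Counter()
    for src, _dst, nbytes in flows:
        per_source[src] += nbytes
    # most_common sorts by total byte count, largest first.
    return per_source.most_common(n)
```

The same aggregation keyed on the (source, destination) pair or on the destination port would give the top flows or top applications instead.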
Northern Europe
No input yet.
Russia
We use the monitoring system developed at the IKI RAS, which
collects netflow statistics from sites and presents them to
the network management staff for analysis.
We use Smokeping to monitor connectivity to
external networks.
The sites themselves and their NOC units use a variety
of tools to monitor network links and equipment status:
MRTG, Nagios, Cacti, Ganglia and some home-grown tools.
South East Europe
Our current network monitoring status and plans are as follows:
Smokeping:
We use Smokeping to monitor network latency, packet loss and jitter to each of the sites in the region. This allows us to monitor network performance between the ROC and the RCs. It also gives us a general idea of the network quality at each RC, although this picture may not be entirely accurate.
http://mon.egee-see.org/cgi-bin/smokeping.cgi
MRTG and Weathermap:
We use these to collect network usage information from GRNET- and SEREEN-managed routers in our international networks.
Nagios:
Nagios is used for fault detection on the connectivity status of the RCs in Bulgaria. If a network fault occurs at any RC, we are notified by email and/or SMS.
If needed we can expand this to cover the SEE Region.
South West Europe
No input yet.
UKI
An overview of network monitoring in the UK is shown at
http://www.gridpp.ac.uk/gridpp18/P20070320-RT-18Collab.pdf