SHORT UNAVAILABILITY OF ACCMEAS & LASER DATABASES

Description

The Oracle virtual IPs (VIPs) and listeners of the ACCMEAS and LASER databases went down on the afternoon of Wednesday 11 April 2012, making both databases unavailable for about 5 minutes.

Impact

  • It was not possible to make any new connections to the ACCMEAS and LASER databases. Existing connections were unaffected.

Timeline of the incident

  • 11-Apr-12 16:46 - ACCMEAS and LASER database VIPs and listeners went down. The failure was detected immediately by the monitoring systems (RACMON and OEM).
  • 11-Apr-12 16:50 - VIPs and listeners were manually restarted by Dawid.

Analysis

  • A five-year-old change to the configuration in the Quattor templates introduced a hidden (and until now unnoticed) dependency, which led to a network restart on the ACCMEAS and LASER database servers.
  • Further investigation showed that:
    • RHEL4 machines (dbracdes, dbraccastor and dbracacc clusters) were exposed to the same problem;
    • RHEL5 machines (dbgen3 and database clusters) did not have the same dependency, but two other components added a dependency on network: rac and sendmail.
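
A check for such hidden dependencies can be sketched as a small graph traversal. The Python below is an illustration only, not the tooling used during the investigation. It assumes the component dependencies have been exported into a plain dictionary; the names DEPENDENCIES and components_depending_on are hypothetical. It contains only the three edges named in this report (nfs, rac and sendmail, each depending on network) and does not model which cluster carries which edge.

```python
from collections import deque

# Edges named in this report, written as: component -> components it depends on.
# Assumption: this flat dictionary stands in for the real template configuration.
DEPENDENCIES = {
    "nfs": {"network"},
    "rac": {"network"},
    "sendmail": {"network"},
}


def components_depending_on(target, dependencies):
    """Return every component that depends on `target`, directly or transitively."""
    # Invert the graph: component -> set of components that depend on it.
    dependents = {}
    for component, needs in dependencies.items():
        for needed in needs:
            dependents.setdefault(needed, set()).add(component)

    # Breadth-first walk from the target through its dependents.
    found = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for component in dependents.get(current, ()):
            if component not in found:
                found.add(component)
                queue.append(component)

    found.discard(target)  # only relevant if the graph contains a cycle
    return found


if __name__ == "__main__":
    print(sorted(components_depending_on("network", DEPENDENCIES)))
```

Any component returned by such a check is a candidate for review, since a dependency on network is what this report identifies as the risk.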

Follow up

  • Remove the dependencies on the network component from the other components:
    • On dbracdes, dbraccastor, dbracacc clusters (10g on RHEL4):
      • turn off the "dispatch" of dirperm
      • remove the nfs -> network dependency
    • On ncm-rac configuration (10g on RHEL4 and RHEL5):
      • remove the rac -> network dependency
    • On hardware bunch and service templates for dbgen3 and database clusters (10g and 11g on RHEL5):
      • remove the sendmail -> network dependency

  • All actions implemented by Giacomo on Thursday 12 April (15:30).

  • State explicitly in the operator instructions that the network must not be restarted on the database servers (a message will be displayed at login).

-- EvaDafonte - 13-Apr-2012

Topic revision: r4 - 2012-04-20 - EvaDafonte
 