Proposal for LCG and/or EGEE to standardize on SL as the base Linux platform.
-----------------------------------------------------------------------------

Problem:
--------

LHC experiments expect to find compatible runtime environments at all
participating laboratories. LCG uses tag-based job allocation that cannot
easily cope with a variety of operating environments. Validation of a new
platform for all experiments' use is largely manual and cannot be repeated
too often. There is a clear need for standardization.

Sites need to customize their operating environment, for example to enforce
local security policies; this is expected to continue into the future.
Sites also have non-matching release cycles and different updating policies.

Recent developments:
--------------------

Most of the HEP sites see Red Hat Enterprise 3 as a viable successor to the
various Red Hat-based solutions currently deployed [HEPiX
Vancouver/Edinburgh]. It is available as a supported commercial product
from Red Hat, and freely (but unsupported) in recompiled form from various
groups, including CERN and Fermi. Security updates are available via a
vendor support contract or in source form for free. All of these versions
are binary compatible with each other (this is an explicit design goal of
the recompilation efforts), but the installed software packages may vary
between sites.

Fermi has made an effort to separate their site customizations from a
reusable set of core packages. Fermi also provides a set of scripts to
allow for easy per-site customizations, including the facility to
"re-brand" and to add/replace software packages. Core packages and scripts
have been released to the HEP community under the name of "Scientific
Linux". Fermi will base their next production release on this; CERN is
evaluating whether a switch can be done this late in the ongoing
certification process. Several HEP institutes (e.g.
TRIUMF) have expressed strong interest, and others were following CERN's
distribution in the past and will probably do so in the future (e.g.
NIKHEF).

Several of the US DOE labs have purchased commercial Red Hat under a
recently negotiated agreement. A similar agreement is expected to apply to
other HEP labs. SLAC currently runs the commercial version in production.

This presents a window of opportunity to get a large part of HEP to agree
on a (set of) common (binary compatible) distribution(s). Initial
discussion shows support for common packaging and deployment policies
(prefer backported fixes to "core" packages, do not replace "core" packages
except for security), but neither complete synchronization between sites
nor "free" support for other institutes should be expected.

Proposal:
---------

The common set of packages in Scientific Linux (before site customization)
will be binary compatible with the commercial Red Hat Enterprise 3 Linux.
Scientific Linux "sites" will be discouraged from modifying this in
incompatible ways. It will be possible to certify physics applications and
middleware on either Scientific Linux or commercial Red Hat Linux
interchangeably.

LCG should ask collaborating experiments to validate their production
software on one of the Scientific Linux derivatives or a suitably
restricted subset of Red Hat Enterprise. Since nearly all of these will
require per-site modifications, care has to be taken during the build
process not to pick up add-on or customized packages.

Such software would then be expected to run on all SL-derived operating
environments without further per-site certification. It should also show a
high degree of resiliency against the inevitable configuration "dithering"
from software updates. Automatic testing/validation suites are still
recommended to allow easy transitions between releases.
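
As an illustration of the build-host precaution mentioned above, the
following is a minimal sketch, not part of the proposal itself. It assumes
that a site can produce two plain lists of package names: the agreed core
set and the inventory of the build host. The function names and the sample
package names are hypothetical. The check compares names only; detecting a
customized package that keeps the name of a core package would additionally
require comparing versions or checksums.

```python
# Hypothetical sketch: flag packages on a build host that are not part of
# the agreed "core" set. All package names below are illustrative only.


def non_core_packages(installed, core):
    """Return the sorted names of installed packages absent from the core set."""
    return sorted(set(installed) - set(core))


def is_clean_build_host(installed, core):
    """Return True when every installed package belongs to the core set."""
    return not non_core_packages(installed, core)


if __name__ == "__main__":
    # Assumed inputs; a real site would read these from its own inventory.
    core = ["glibc", "gcc", "openssl"]
    installed = ["glibc", "gcc", "openssl", "site-afs-client"]

    for name in non_core_packages(installed, core):
        print("not in core set:", name)
```

A build host would be considered suitable for validation builds only when
is_clean_build_host() returns True for its inventory.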

Development should be encouraged to stay with as few versions of libraries
as possible, to facilitate later rollout into a new "common" release. The
practice of "locking" onto certain library versions conflicting with the
standard system installation is expensive to maintain (indefinite
availability required), and the "production" releases should be able to use
pre-installed software as far as possible.

Scope:
------

Initially until 12/2005 (current minimum lifetime across Fermi and CERN).
Extension is possible given that the underlying base product (Red Hat
Enterprise 3) will get security updates until 2008.

Endorsement:
------------

Written by Mark Kaletka (Fermi) and Jan Iven (CERN).

Footnote:
---------

Red Hat has offered serious price discounts for the commercial version, but
seems not yet to have grasped the uniqueness of the HEP community.
Furthermore, it looks like only a combined package of all their software
management products and services would allow replacement of local support
efforts. (Our current understanding is that the base "entitlement" price
includes only security updates with no direct technical support, with
updates coming directly from Red Hat, which do not scale well and offer no
admin control. High-level consultant-like support service comes only with
the Technical Account Manager (TAM); this requires additional per-node
support. Software management on par with existing implementations requires
a per-node fee and the "Satellite Server" product.) These offers are still
being tested. CERN and Fermi have announced that their next production
versions will be based on a "free" alternative; the next opportunity to
switch would be in 2005.