This page is under construction.

CORAL and COOL repository migration from CVS to SVN

This page describes some of the issues encountered during the CVS to SVN migration of CORAL and COOL, and how they were solved (giving many more details than savannah task #10423). Note that POOL was not and will not be migrated to SVN. The two CVS repositories for CORAL and COOL have been separately migrated to two SVN (subversion) repositories, hosted by the CERN Central SVN Service. The migration was performed using cvs2svn, following the guidelines presented in the CERN Central SVN Service HOWTO.

In this process we tried to stick to the usage and conventions adopted by the LHC experiments in their migration from CVS to SVN. The following links have been useful.

The issues and solutions described below were initially tested using the cvs2svn "sandbox" at CERN and browser access to the corresponding Sandbox WebSVN portal. Because some issues were observed there (e.g. in the display of binary PPT/JPG files), most of the tests were also repeated using the production WebSVN portal and TRAC, and in some cases using command-line tools such as svn co and wget. Now that the migration has been completed, the CORAL and COOL repositories can be browsed using the production WebSVN and TRAC portals at the following links:

Migration tools

The migration was executed on an SLC6 node (slc6pf01) using the then most recent trunk version 5429 of cvs2svn. This was preferred to the latest stable SLC6 version 4998 as it implements a few new features that were initially thought to be useful for a better handling of symbols.

Copies of the latest migration tools and of all relevant input and data files have been placed on AFS in /afs/cern.ch/user/l/libcoral/cvs2svn, where they will be kept permanently. The tools have also been committed to SVN in the Cvs2Svn contrib area of COOL.

Handling of symbols (tags and branches) during the cvs2svn migration

Symbols (tags and branches): release vs development instances

Symbols (tags and branches): project-level vs package-level granularity

Advantages of package-level-granularity:

Disadvantages of package-level-granularity:

  • The 'prune' option was used in the cvs2svn migration to delete a directory once the last file has been deleted from it, but this has no effect on whole packages that are migrated as individual 'projects'. Obsolete packages that contain no files after a given revision (e.g. coral/AccessPlugin or coral/Tests/UnitTests) are not automatically 'pruned' from the SVN repository and need to be manually deleted after the cvs2svn migration. Luckily, this concerns only the trunk of such packages, hence this cleanup is not too complex.

Symbols (tags and branches): partial tags/branches lead to broken links in WebSVN

As an example, a wrong tag Coral-preview (correct tag is CORAL-preview) was applied to a single header file in CoralBase. In cvs2svn this is handled as if the tag had been applied to the whole directory CoralBase and had then been removed from all files but the tagged one. This is what is shown by WebSVN and TRAC:

After an extensive investigation of several cvs2svn features (including the "empty directories" option or the possible use of hints to "sprout" the "lods"; see INC317008), it was eventually understood that this is essentially an issue in the cvs2svn migration algorithm (which tags/adds whole projects and removes/replaces individual files, rather than tagging/adding individual files), which is then reflected (in different ways) in the two source browsers. The possibility of changing this in cvs2svn was considered, but was eventually abandoned as too complex and error-prone for very limited benefits. Once the issue is understood, interpreting the information shown by both browsers is straightforward.

Handling of file properties during the cvs2svn migration

Text and binary files were handled differently in the migration. We used file suffixes to differentiate between the two.

Text files (source code)

These are the main conclusions from our analysis of the issues observed in the migration of text files containing source code in the CORAL CVS.

  1. The svn:eol-style property must be set to native for all text files.
    • This is the recommended default for text files in SVN. We did not observe any issues related to this.
  2. The svn:mime-type property should preferably be left not set for all text files.
    • For C++ source code files, if this property is not set, WebSVN displays them inline with syntax highlighting together with all SVN metadata. If svn:mime-type is set to text/plain, WebSVN displays these files on a page of their own without syntax highlighting or SVN metadata (see INC317036). If svn:mime-type is set to application/x-cplusplus, text/x-c++ or text/x-c, WebSVN instead offers to download the files or open them with an external program.
    • TODO: check on TRAC (add some test files to the real repository?). We did not repeat the tests above with TRAC, but the display is good if svn:mime-type is not set.

Binary files (images and MS Office files)

These are the main conclusions from our analysis of the issues observed in the migration of binary files (JPG image, PPT presentations) in the COOL CVS.

  1. The svn:eol-style property must not be set for binary files.
    • cvs2svn must be explicitly configured not to set this property. By default, cvs2svn would set this to native and files would be handled as text files.
    • If binary files are inserted in SVN with svn:eol-style native, they can no longer be opened correctly after a checkout with svn co. Note that this is a feature of SVN itself, not an issue with WebSVN or TRAC. For instance, in one migration test (where mime types were not defined and the ppt suffix had been forgotten from auto-props), ppt files were migrated with svn:eol-style native: when checked out with svn co, these files could not be opened with PowerPoint.
  2. The svn:mime-type property should preferably be set to application/octet-stream.
    • For PPT files, if this property is not set, WebSVN displays the binary file inline as a text stream of binary characters. If svn:mime-type is set to application/octet-stream (or also application/vnd.ms-powerpoint), WebSVN offers instead to download the files and/or open them with PowerPoint. We did not see any clear advantage in using application/vnd.ms-powerpoint over application/octet-stream. TRAC always offers to download the files and/or open them with PowerPoint whether svn:mime-type is set to application/octet-stream or is not set; it does however display the binary file inline as a text stream of binary characters if svn:mime-type is set to text/plain.
    • For JPG files, WebSVN and TRAC display the image as a standalone web page whether this property is set to application/octet-stream or image/jpeg or is not set. We only saw potential issues if it is set to text/plain (TRAC and the buggy sandbox instance of WebSVN display an inline stream of bytes as text, but the working production instance still shows the image even in that case). We did not see any clear advantage in using image/jpeg over application/octet-stream.
  3. The svn:executable property should preferably be set to *.
    • This was not tested in detail, but this property seemed to be largely irrelevant. Since PPT and JPG files coming from Windows are generally seen as executables (and get this property set when added to the SVN repository from scratch), it seems safer to add this property everywhere for consistency.
  4. The 'sandbox' instance of WebSVN seems affected by a bug that is absent in the 'production' instance.
    • Neither PPT nor JPG files could be correctly opened or displayed when opened through the sandbox instance (https://svnweb.cern.ch/cvs2svn/wsvn), while they could be correctly opened and displayed when retrieved through the production instance (https://svnweb.cern.ch/cern/wsvn). This was reported in INC324979.
    • This is a problem with WebSVN rather than with the cvs2svn migration settings or with the SVN properties of the files. We checked that PPT and JPG files can be opened and displayed if they are downloaded via svn co. We also checked that the SVN repositories (the contents of their db directories) were strictly identical in a test where we added some PPT and JPG files from scratch to two repositories connected to the 'sandbox' and 'production' instances of WebSVN. We also checked that no issue was seen on the production repository with TRAC (there is no TRAC instance connected to the sandbox).
    • Visual inspection of the files downloaded through the sandbox instance of WebSVN (either directly from a browser or via wget with cookies to access protected pages) suggests that these files get corrupted by the addition of a leading character. Their size is exactly one byte larger than expected and the corruption can be removed by manually erasing the first character in a text editor (the files can then be opened and displayed normally after that).
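
The check described in the last bullet (one extra byte at the start of the file) can be sketched in shell as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "downloaded" stands for a file retrieved through the sandbox WebSVN and "reference" for the same file obtained via svn co.

```shell
# Size of a file in bytes.
size_of() { wc -c < "$1" | tr -d ' '; }

# Succeed if the downloaded file ($1) is exactly one byte larger than the
# reference file ($2) and identical to it once its first byte is dropped.
has_leading_byte() {
  [ "$(size_of "$1")" -eq $(( $(size_of "$2") + 1 )) ] || return 1
  tail -c +2 "$1" | cmp -s - "$2"
}

# Write a copy of $1 without its first byte to $2
# (tail -c +2 starts the output at the second byte).
strip_first_byte() { tail -c +2 "$1" > "$2"; }
```

For example, "has_leading_byte talk_from_websvn.ppt talk_from_svnco.ppt && strip_first_byte talk_from_websvn.ppt talk_fixed.ppt" (file names invented for the example) would repair a file showing this symptom without going through a text editor.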

Auto properties based on file suffixes

These are all of the distinct suffixes from the files committed to CORAL and COOL CVS:

  • From echo `find lcgcoral-201307/ -type f -name '*,v' -exec basename {} ,v \; | awk '{n=split($0,a,"."); if (n>1) print a[n];}' | sort -u`:
    • 2 AK awk bat c C cfg cmt conf cpp cpp~ cpp_Govi cproject csh css csv cvsignore cxx db doc dtd err fig gz h hpp htaccess html icpp inc inl jpg log mk notes ora out php pid15613 pid2369 pl pm png project py pyc qmc qmt ref rules sh sql summary supp tmpl txt xml zip
  • From echo `find lcgcool-201307/ -type f -name '*,v' -exec basename {} ,v \; | awk '{n=split($0,a,"."); if (n>1) print a[n];}' | sort -u`:
    • 0 1 123 2 actions adjustTimeZone ANALYZED awk bat bhost bmp c cfg CLOB CMS_ECAL cmt commonName conf COOL COOL_1_1_0 CORAL cpp cppp csh css ctf cvsignore cxx dat db debug default141 diff dlclose doc doNotAdjustTimeZone dox Doxyfiles el EmptyFile error fatal Foundation free frontier full g gdb ggo gif gmt gmtime gp gz h HEAD hpp htaccess htm html icpp in info jpg laptopLinux laptopLinux2 laptopLinux3 leak lfc log lookup mac mht mk mk~ MutexLock mysql MySQL MySQLAccess new noclob noDlclose noexec noleak noPool NOPOOL noPurge notes NT OCI_DEFAULT ociTest OCI_THREADED opts ora Oracle osx103_gcc33 osx104_ia32_gcc401 osx104_ia32_gcc401_dbg osx104_ppc_gcc401 osx104_ppc_gcc401_dbg out patch pdf pem Performance php pl png ppt pptx prf py pyc python qmc qmr qms qmt ref rh73_gcc32 rh73_gcc323 rh73_gcc323_dbg rh73_gcc32_dbg rootrc SEAL security segmentationFault sh skipGrant slc3_amd64_gcc344 slc3_amd64_gcc344_dbg slc3_ia32_gcc323 slc3_ia32_gcc323_dbg slc3_ia32_gcc323_gcov slc3_ia32_gcc323_test slc3_ia32_gcc344 slc3_ia32_gcc344_dbg slc4_amd64_gcc34 slc4_amd64_gcc345 slc4_amd64_gcc345_dbg slc4_amd64_gcc34_dbg slc4_ia32_gcc34 slc4_ia32_gcc345 slc4_ia32_gcc345_dbg slc4_ia32_gcc34_dbg slc4_ia32_gcc41 sql sqlite sqlTrace src standalone STANDALONE summary supp swp tar templ template tests tex threads timing tpl trc txt typ ui unset verbose warning win32_vc71_dbg win32_vc71_dbg_cmt win32_vc71_dbg_cyg win32_vc71_dbg_wine windows wine wineTest wineVsCygwin xls xml xslt
  • Of these, the following suffixes have been included in auto-props in order to handle the corresponding files as binary:
    • bmp db doc fig gif gz jpg pdf png ppt pptx pyc qmr swp tar xls zip
  • All other files have been handled as text files in the migration (even if a file not included above is marked as binary in CVS via the '-kb' property, this is ignored and the file is migrated as text). SVN keywords were not set for any files anyway, including text files.
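
As an illustration of how the suffix-based split maps onto the property conclusions of the previous sections, here is a minimal shell sketch. It is not the cvs2svn configuration that was actually used: is_binary_name and prop_cmds are hypothetical helper names, the match is case-sensitive, and the commands are only printed for review (file names containing single quotes are not handled).

```shell
# Suffixes treated as binary in the migration (copied from the list above).
BINARY_SUFFIXES="bmp db doc fig gif gz jpg pdf png ppt pptx pyc qmr swp tar xls zip"

# Succeed if the base name of $1 ends in one of the binary suffixes.
# Names without any "." are treated as text.
is_binary_name() {
  name=${1##*/}
  case "$name" in
    *.*) suffix=${name##*.} ;;
    *)   return 1 ;;
  esac
  for s in $BINARY_SUFFIXES; do
    [ "$suffix" = "$s" ] && return 0
  done
  return 1
}

# Print (do not run) the svn commands matching the conclusions above:
# binary files get no svn:eol-style, mime-type application/octet-stream and
# svn:executable; text files get svn:eol-style native and no svn:mime-type.
prop_cmds() {
  for f in "$@"; do
    if is_binary_name "$f"; then
      printf "svn propdel svn:eol-style '%s'\n" "$f"
      printf "svn propset svn:mime-type application/octet-stream '%s'\n" "$f"
      printf "svn propset svn:executable '*' '%s'\n" "$f"
    else
      printf "svn propset svn:eol-style native '%s'\n" "$f"
      printf "svn propdel svn:mime-type '%s'\n" "$f"
    fi
  done
}
```

Printing the commands rather than running them keeps the sketch free of side effects; the output can be inspected and then piped to sh inside a working copy.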

POOL was migrated using the same auto-props as for CORAL and COOL, described above. A posteriori, it was checked that these are all of the distinct suffixes from the files committed to POOL CVS:

  • From echo `find lcgpool-201308/ -type f -name '*,v' -exec basename {} ,v \; | awk '{n=split($0,a,"."); if (n>1) print a[n];}' | sort -u`:
    • 0-toolbox 1-toolbox 6-toolbox bat C cat cfg cmt conf cpp csh css cvsignore cxx defunc doc dtd env eps fig gif h h~ hmtl hmtl~ htm html html~ inl jpg log lxshare070d mdl mk mpp notes options pdf pjt pl pm png ppt py qmc qmr qms qmt reader ref rules saved sh sql svg sxi sxw test tmpl txt vthought writer xml zip
  • It was not checked explicitly if any of these suffixes represent additional binary file types that should have been handled differently in the migration and/or recovered a posteriori. However (see below) it was checked whether the CVS POOL-preview tag could be recovered from SVN. It was observed that a single file AttributeList/doc/AttributeList-pool-component.sxw had been corrupted in the cvs2svn migration and had to be recovered back from CVS.

Post-migration configuration and cleanup

TRAC administration

Fix an issue with TRAC after rerunning the COOL migration (to move VerificationClient from contrib to cool).

Configure access control and commit hooks

The following post-migration actions involve modifications to the conf and usr-hooks sub-directories of the SVN repositories /afs/cern.ch/project/svn/reps/<project>.

  • These changes could be executed at the file-system level (using the AFS ACL of accounts libcoral and libcool). However, this would not provide any versioning for those scripts.
  • Alternatively, conf and usr-hooks can be modified in the special 'admin' SVN repository, as explained in the CERN SVN Service HOWTO. The admin repository can also be browsed (using the appropriate account) on WebSVN (but not on TRAC).
  • The lcgcool, lcgcoral and lcgpool subdirectories of the admin repository were initially accessible in read-write mode to the libcool, libcoral and libcoral librarian accounts, respectively. All three were also accessible to the primary owner account valassi.
    • Created three egroups VC-librarians-lcgcool, VC-librarians-lcgcoral and VC-librarians-lcgpool and sent requests (RQF0236976 and RQF0249689) to add them to the allowed admins. According to the doc this should have worked out of the box, but the doc was obsolete.

Hooks to send automatic emails on SVN commits.

  • Enable svn-mailer.py in /afs/cern.ch/project/svn/reps/<project>/usr-hooks/post-commit.
  • Disable revision diffs in /afs/cern.ch/project/svn/reps/<project>/conf/svn-mailer.conf. The email address for notifications is already set.

Hooks to forbid revision log changes (all other revision property changes are already forbidden).

  • Forbid revision log changes in /afs/cern.ch/project/svn/reps/<project>/usr-hooks/pre-revprop-change.
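
A policy of this kind can be sketched as below. This is a generic Subversion example, not the script installed at CERN: how the usr-hooks directory wraps the standard hooks is an assumption here, and the function name is hypothetical.

```shell
# Sketch of a pre-revprop-change policy that rejects every change, including
# changes to svn:log. Subversion calls this hook with five arguments:
#   REPOS-PATH REVISION USER PROPNAME ACTION   (ACTION is A, M or D)
# and refuses the property change if the hook exits with a non-zero status.
pre_revprop_change() {
  propname=$4
  echo "Changing revision property '$propname' is not allowed" >&2
  return 1
}

# In an actual hook script the last line would be:
#   pre_revprop_change "$@"
```

Rejecting unconditionally is the simplest form; a less strict policy would test the property name and action and return 0 for the cases it wants to allow.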

Configure access control.

  • Grant librarian privileges to avalassi and valassi in /afs/cern.ch/project/svn/reps/<project>/conf/authz. Remove the "Andrea.Valassi" group.
  • Request anonymous read access for all three repositories (RQF0239694 and RQF0249689).

CVS write access removal and SVN repository cleanup

Remove write access from CORAL (lcgcoral), COOL (COOL) and POOL (PF) CVS repositories

  • Only allow commits anywhere by cvsadmin and the libcoral/libcool/libcoral librarian accounts for the three projects.
  • In addition, only allow commits by avalassi under CVSROOT.
  • Forbid tags from anyone anywhere.
  • Requested (RQF0237336) that in the AFS ACL the three previous CVS librarian accounts cvscoral, cvscond and cvspf be replaced by the SVN librarian accounts libcoral, libcool and libcoral (also for POOL).
    • Deleted the cvscond, cvscoral and cvspf accounts after the AFS ACL changes were done.
    • Note that ViewCVS will be switched off at the same time as the CVS servers.

SVN repository cleanup for CORAL and COOL.

  • Last SVN revisions from the cvs2svn migration (18994 for lcgcoral and 18528 for lcgcool): create 'cvs201307' tags describing CVS before the migration.
  • First SVN commits in both projects (18995 in lcgcoral and 18529 in lcgcool): remove the cvs2svn:cvs-rev property from all files in trunk/CVSROOT.
  • Commit to SVN trunk/CVSROOT the two last CVS changes that remove write access (18997 in lcgcoral and 18531 in lcgcool).
  • Remove trunk of obsolete packages migrated as individual 'projects' (18998 in lcgcoral and 18532 in lcgcool/cool and 18533 in lcgcool/contrib).
  • Remove svn:executable property from .h and .cpp files in trunk, patches tags and active branches. This was not cleaned up in CVS as these files were installed as executable in AFS releases.
    • Remove svn:executable property from .h and .cpp files in lcgcoral trunk (18999).
    • Remove svn:executable property from .h and .cpp files in lcgcoral CORAL-preview (19000).
    • Remove svn:executable property from .h and .cpp files in lcgcoral CORAL_2_3-patches (19001).
    • Remove svn:executable property from .h and .cpp files in lcgcool trunk (18534).
    • Remove svn:executable property from .h and .cpp files in lcgcool COOL-preview (18535).
    • Remove svn:executable property from .h and .cpp files in lcgcool COOL_2_8-patches (18536).
  • Keep a copy of the latest CORAL_2_3-ATLAS-branch in CVS as tag cvs201307_CORAL_2_3-ATLAS-branch (19002).
  • Move doc and CVSROOT out of CORAL releases
  • Move doc and CVSROOT out of COOL releases
  • First production releases based on SVN
    • New release CORAL_2_3_27a with tag copied from CORAL_2_3-patches (lcgcoral:19007)
    • New release COOL_2_8_18a with tag copied from COOL_2_8-patches (lcgcool:18541)
  • Ensure that branches contain whole directories so that they can be checked out over the checkout of the trunk (note that 'svn status' will show an "S" to indicate this 'switch').
    • Copy the whole of CoralBase/CoralBase from CORAL_2_3-patches to CORAL_2_3-branch (lcgcoral:19008 to lcgcoral:19010). Previously only VersionInfo.h was in the branch.
    • Copy the whole of CoolKernel/CoolKernel from COOL_2_8-patches to COOL_2_8-branch (lcgcool:18546). Previously only VersionInfo.h was in the branch.
  • Remove the cvs2svn:cvs-rev property from all files in the trunk and active branches or sliding tags.
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral trunk (lcgcoral:19011)
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral CORAL-preview (lcgcoral:19012)
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral CORAL_2_3-patches (lcgcoral:19013)
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral CORAL_2_3-branch (lcgcoral:19014)
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral CORAL_2_3-ATLAS-branch (lcgcoral:19015)
    • Remove cvs2svn:cvs-rev property from all files in lcgcoral CORAL_2_4-branch (lcgcoral:19016)
    • Remove cvs2svn:cvs-rev property from all files in lcgcool trunk (lcgcool:18542)
    • Remove cvs2svn:cvs-rev property from all files in lcgcool COOL-preview (lcgcool:18543)
    • Remove cvs2svn:cvs-rev property from all files in lcgcool COOL_2_8-patches (lcgcool:18544)
    • Remove cvs2svn:cvs-rev property from all files in lcgcool COOL_2_8-branch (lcgcool:18545)
    • This has the well-known downside that all files will appear to have been recently modified, even if they have not really changed in a long time. Unfortunately, this property should not be modified in commit hooks. Removing it during the cvs2svn migration (in the CVSRevisionNumberSetter class) after the last CVS revision of a file is committed seems too complex compared to its benefits. The only remaining options are to not move the metadata in the migration (which would be a pity as they are used in savannah to document code changes), to keep them forever (which may lead to very confusing situations) or to remove them after the migration (which has the side effect that all files appear to have been recently modified).
  • TODO: add SVN keywords for all non-binary files?
  • TODO: check how to configure auto props server-side for newly committed files
  • TODO: archive (in SVN?) the tools and data for the cvs2svn migration.
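
The two property cleanups in the list above (svn:executable on .h and .cpp files, cvs2svn:cvs-rev on all files) can be sketched as follows. This is a minimal illustration, not the commands that were actually run: the function names are hypothetical and the argument is assumed to be a working-copy directory such as a checkout of the trunk. Nothing is committed by these functions.

```shell
# List .h and .cpp files under a working copy, skipping .svn admin directories.
list_sources() {
  find "$1" -name .svn -prune -o -type f \( -name '*.h' -o -name '*.cpp' \) -print
}

# Remove svn:executable from the .h and .cpp files under $1.
drop_executable() {
  list_sources "$1" | while IFS= read -r f; do
    svn propdel svn:executable "$f"
  done
}

# Remove cvs2svn:cvs-rev recursively from everything under $1.
drop_cvs_rev() {
  svn propdel -R cvs2svn:cvs-rev "$1"
}
```

After either function, "svn status" and "svn diff" on the working copy show the pending property changes, which can then be committed separately per directory as was done in the revisions listed above.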

Minimal SVN repository cleanup for POOL.

  • The last SVN revision from the cvs2svn migration for POOL is 25744.
  • A minimal cleanup of the SVN repository was performed for POOL, limited to the trunk and especially the POOL-preview tag (up until revision 25751). In particular, the SVN repository was changed (by removing several files from the POOL-preview tag and by also replacing one binary file) until the following gave no remaining differences:
    • diff -r POOL-preview/ pool.release/ --exclude=.svn --exclude=CVS --exclude=.cvsignore -I'$Id' -I'$Date' -I'$Header'
  • In the CVS directory, however, several files had to be processed through dos2unix before running the above diff:
    • dos2unix AttributeList/doc/AttributeList-pool-component.sxw Collection/ChangeLog Tests/Collection_BackNavigate/src/Collection_BackNavigate.cpp Tests/Collection_ExplicitReadPerformance/src/Collection_ExplicitReadPerformance.cpp Tests/Collection_ExplicitWritePerformance/src/Collection_ExplicitWritePerformance.cpp Tests/Collection_FileInfoRetrieve/src/Collection_FileInfoRetrieve.cpp Tests/Collection_MultiFileUpdate/src/Collection_MultiFileUpdate.cpp Tests/Collection_MultiFileWrite/src/Collection_MultiFileWrite.cpp Tests/Collection_Update/src/Collection_Update.cpp config/doxygen/Doxyfile config/doxygen/Doxyfile_POOL.cfg
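
The comparison above can be wrapped in a small function so that it is easy to rerun after each fix. This is only a sketch: compare_trees is a hypothetical name, its two arguments stand for the SVN checkout of the tag and the CVS release directory, and the behaviour of -I and --exclude is assumed to be that of GNU diff.

```shell
# Compare an SVN checkout of a tag ($1) with a CVS checkout of the same tag
# ($2), ignoring version-control files and hunks in which every changed line
# contains an expanded $Id, $Date or $Header keyword.
# Exit status: 0 if no relevant differences remain, 1 otherwise.
compare_trees() {
  diff -r \
    --exclude=.svn --exclude=CVS --exclude=.cvsignore \
    -I'$Id' -I'$Date' -I'$Header' \
    "$1" "$2"
}
```

With the directories used above, the call would be "compare_trees POOL-preview/ pool.release/", to be run after the dos2unix step.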

Repository improvements for CORAL and COOL using the SVN-specific 'external' feature.

  • Avoid the duplication of common files in both CORAL and COOL using SVN externals.
    • Remove USERCONTEXT/avalassi from COOL trunk and take it as an external from CORAL trunk (lcgcool:18581)
    • Remove USERCONTEXT/avalassi from CORAL and COOL nightly tags and a fortiori from release tags (lcgcoral:19114 and lcgcool:18667)
  • Avoid the duplication of logs across CORAL trunk, tags and branches
    • Move lcgcoral/coral/trunk/logs to lcgcoral/logs (lcgcoral:19072)
    • Remove logs from CORAL_2_3-patches and CORAL-preview (lcgcoral:19073)
    • Take lcgcoral/coral/trunk/logs as an external from lcgcoral/logs (lcgcoral:19074)
    • Take lcgcoral/coral/tags/CORAL-preview/logs as an external from lcgcoral/logs (lcgcoral:19075)
    • Take lcgcoral/coral/tags/CORAL_2_3-patches/logs as an external from lcgcoral/logs (lcgcoral:19076)
  • Avoid the duplication of logs across COOL trunk, tags and branches
    • Move lcgcool/cool/trunk/logs to lcgcool/logs (lcgcool:18599)
    • Remove logs from COOL_2_8-patches and COOL-preview (lcgcool:18600)
    • Take lcgcool/cool/trunk/logs as an external from lcgcool/logs (lcgcool:18601)
    • Take lcgcool/cool/tags/COOL-preview/logs as an external from lcgcool/logs (lcgcool:18602)
    • Take lcgcool/cool/tags/COOL_2_8-patches/logs as an external from lcgcool/logs (lcgcool:18603)
  • Avoid the duplication of common qmtest tools in both CORAL and COOL using SVN externals (for whole directories) and symbolic links (for individual files).
    • Move qmtest tools in CORAL from logs/qmtest to logs/qmtest/tools and add them back as symbolic links (lcgcoral:19086)
    • Remove qmtest tools in COOL from logs/qmtest and add them back as symbolic links to the (CORAL external) logs/qmtest/tools (lcgcool:18620)
  • WARNING: In some cases, the ssh password was requested again while "fetching external items" from SVN. No clear pattern was observed. After removing "-q" from the ssh command in ~/.subversion/config, some specific IP addresses in the svn.cern.ch cluster were printed out, but when these were tested explicitly no authentication problems were observed. Using ssh sharing does not seem to be appropriate (and in any case it does not work if ~/.ssh is on AFS). If any issues are observed again, it may be worthwhile to test specific nodes again, as the svn.cern.ch cluster has several nodes (many more than the three nodes visible using the 'host' command).
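
For the logs directories, an external of the kind listed above can be defined with svn propset. The sketch below only prints the command for review. It assumes a working copy checked out from the root of the lcgcoral repository and uses the repository-relative "^/" URL form of Subversion 1.5 and later; the function name is hypothetical, and the exact property values used in the repositories are not reproduced here.

```shell
# Print the svn command that makes "logs" under directory $1 an external
# pointing at the repository-level logs directory.
# Note that svn propset replaces any svn:externals value already set on $1.
logs_external_cmd() {
  printf "svn propset svn:externals '^/logs logs' '%s'\n" "$1"
}
```

For instance, "logs_external_cmd coral/trunk" prints the command for the trunk. An external taken from a different repository (such as USERCONTEXT/avalassi taken by COOL from CORAL) cannot use "^/" and needs the URL of the other repository instead.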

-- AndreaValassi - 21-Aug-2013

Topic revision: r33 - 2016-04-09 - AndreaValassi