Let's suppose you have one or more local CVS and/or SVN repositories, and you want to merge them (with their change history) into new directories of an existing, possibly remote target SVN repository. This blog post explains how to do this using Unix tools.
The following tools will be used: cvs2svn, svnadmin, svndumpfilter and svn-pusher.
Please note that we won't use
svnsync for the merge itself, because it requires the target repository to be empty. We won't use
svn-merge-repos.pl either, because it requires the target repository to be local. We will use
svnadmin load only sparingly, because it too requires the target repository to be local.
Access remote repositories for the first time
For some repositories, you have to specify your username (usually on the command line) and password (usually by answering an interactive prompt) in order to connect. You have to do this only once per machine, because SVN records your username and password in files under
$HOME/.subversion/auth. To avoid problems connecting later, make sure you access the remote repositories on each machine you'll be working on, so your credentials get saved. The easiest way to do it is to run
svn ls URL://TO/SVN/REPOSITORY --username MYUSER
Install svn-pusher
svn-pusher can add any SVN repository to another SVN repository (as a subdirectory), whether local or remote, keeping the commit history of the specified revision interval. You can skip this installation step now and come back only if you are asked to use
svn-pusher in some of the steps below.
svn-pusher is implemented as a Perl script using SVN's Perl bindings (
SVN::Core). The easiest way to install
svn-pusher is with root access on a Unix system. For example, on Debian or Ubuntu, run this as root:
# apt-get update
# apt-get install libsvn-core-perl
# echo no | cpan -i SVN::Pusher # this may take about a minute
# type -p svn-pusher
/usr/local/bin/svn-pusher
# svn-pusher help
For the sake of completeness we mention (but we recommend against using)
svn-push as an alternative to
svn-pusher.
svn-push is a tool written in C, available in the SVN
contrib directory, and it does something like
svn-pusher, but it's dumber: it can commit only a single revision at a time, and it needs the head revision number. Here is how to compile it:
$ wget http://svn.collab.net/repos/svn/trunk/contrib/client-side/svn-push/svn-push.c
$ sudo apt-get install libsvn-dev
$ gcc -W -Wall -I/usr/include/subversion-1 -I/usr/include/apr-1.0 \
-Doff64_t='unsigned long long' -o svn-push svn-push.c -lsvn_client-1
$ ./svn-push
Usage : svn-push -r N:M SRC_URL DEST_URL
For the sake of completeness, we also mention the
SVN::Push Perl module, which provides the
svnpush command. SVN::Pusher seems to be more up-to-date.
Convert the CVS repository to an SVN repository dump
You can skip this step if your source repository is not a CVS repository.
A CVS repository is a directory containing
*,v files (possibly in subdirectories), and either containing a directory named
CVSROOT itself or having a parent directory that contains
CVSROOT.
Download and install
cvs2svn from
its download page (the example below fetches the 2.2.0 tarball). It's a Python script, so install Python as well (2.4 or 2.5 should be OK). You don't need root access to run
cvs2svn; in fact, it can be run as extracted from the tarball. Example:
$ wget http://cvs2svn.tigris.org/files/documents/1462/44372/cvs2svn-2.2.0.tar.gz
$ tar xzvf cvs2svn-2.2.0.tar.gz
$ cvs2svn-2.2.0/cvs2svn --help
You don't need Subversion itself for this step –
cvs2svn (if run with
--dumpfile=) needs only Python and the standard Unix
sort utility.
Make sure you have your CVS repository on the same machine as
cvs2svn. (Copy with
scp -r or
rsync if necessary.) It is safest to make a copy of the repository and run the conversion on the copy. Make sure there is a directory named
CVSROOT next to the repository or in one of its parent directories. The
CVSROOT directory can be empty. Make sure that all files in your CVS repository directory are named
*,v (i.e. their name ends with
,v). If you don't need all files or all directories, feel free to remove them now. The effect will be as if those files and/or directories had never been added to the CVS repository.
Run
cvs2svn --dumpfile=PROJECT.dump PATH/TO/CVS/REPOSITORY . This creates the file
PROJECT.dump, which contains all files in the CVS repository, with their full commit history.
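The preparation checks described above can be scripted before you run cvs2svn. What follows is only a sketch under a few assumptions: the function name check_cvs_copy is made up for illustration, it looks for CVSROOT only inside the repository directory and directly next to it (not further up the tree), and it ignores the contents of CVSROOT itself when looking for files not named *,v.

```shell
# Hypothetical helper (not part of cvs2svn): sanity-check a copy of a CVS
# repository before conversion. Returns non-zero if something looks wrong.
check_cvs_copy() {
  repo=${1%/}    # strip one trailing slash so the path test below matches
  status=0
  # Assumption: CVSROOT is either inside the repository or right next to it.
  if [ ! -d "$repo/CVSROOT" ] && [ ! -d "$repo/../CVSROOT" ]; then
    echo "no CVSROOT directory in or next to $repo" >&2
    status=1
  fi
  # Report regular files outside CVSROOT whose names do not end in ",v".
  stray=$(find "$repo" -path "$repo/CVSROOT" -prune -o -type f ! -name '*,v' -print)
  if [ -n "$stray" ]; then
    printf 'files not named *,v:\n%s\n' "$stray" >&2
    status=1
  fi
  return "$status"
}
```

A possible way to use it: check_cvs_copy PATH/TO/CVS/REPOSITORY && cvs2svn --dumpfile=PROJECT.dump PATH/TO/CVS/REPOSITORY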
Convert the SVN repository to an SVN repository dump
You can skip this step if your source repository is not an SVN repository.
If your source repository is local and you have read access to it, just run
svnadmin dump PATH/TO/SVN/REPOSITORY >PROJECT.dump
Otherwise use
svn-pusher like this:
$ svnadmin create PROJECT.copy
$ svn-pusher push URL://OF/SVN/REPOSITORY PROJECT.copy
$ svnadmin dump PROJECT.copy >PROJECT.dump # this may take some time
$ rm -rf PROJECT.copy
As an alternative to
svn-pusher, you can use
svnsync as well (part of the standard SVN installation); see the blog post
Dump a SVN repository from a URL for how to do it. Please note that both
svn-pusher and
svnsync are quite slow (as compared to
svnadmin dump), and
svnsync is better supported and documented since it is part of standard SVN.
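For readers who prefer the svnsync route, here is a rough sketch of what the mirroring could look like. The function name mirror_and_dump is made up, and PROJECT.copy is just a scratch directory in the current directory. The one non-obvious part is the hook: svnsync keeps its bookkeeping in revision properties, so the mirror has to allow revision property changes.

```shell
# Sketch: mirror a remote repository with svnsync, then dump the mirror.
# mirror_and_dump is a made-up name; PROJECT.copy is a scratch directory.
mirror_and_dump() {
  src_url=$1     # e.g. URL://OF/SVN/REPOSITORY
  dump=$2        # e.g. PROJECT.dump
  copy=$PWD/PROJECT.copy
  svnadmin create "$copy" || return 1
  # The mirror must accept revision property changes, so install a
  # pre-revprop-change hook that allows all of them.
  printf '#!/bin/sh\nexit 0\n' > "$copy/hooks/pre-revprop-change"
  chmod +x "$copy/hooks/pre-revprop-change"
  svnsync init "file://$copy" "$src_url" || return 1
  svnsync sync "file://$copy" || return 1    # slow for long histories
  svnadmin dump "$copy" > "$dump" || return 1
  rm -rf "$copy"
}
```

It would be called like this: mirror_and_dump URL://OF/SVN/REPOSITORY PROJECT.dump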
Once you have your
PROJECT.dump file, use
svndumpfilter (part of standard SVN) to get rid of the unnecessary files and directories. You may also edit the file in a text editor to do some other modifications (such as renaming files). The file format should be self-explanatory.
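As an illustration of the filtering step, a svndumpfilter call could be wrapped like this. The function name filter_dump and the example paths are placeholders, not anything prescribed by SVN.

```shell
# Sketch: drop unwanted paths from a dump. filter_dump is a made-up name.
filter_dump() {
  src=$1; dst=$2; shift 2
  # Remaining arguments are path prefixes to remove, e.g. trunk/docs.
  # --drop-empty-revs discards revisions left empty by the filtering;
  # --renumber-revs closes the resulting gaps in revision numbers.
  svndumpfilter exclude --drop-empty-revs --renumber-revs "$@" < "$src" > "$dst"
}
```

For example: filter_dump PROJECT.dump PROJECT.filtered.dump trunk/docs trunk/tmp. Be aware that svndumpfilter can fail if a path you keep was copied from a path you exclude, so check its output before moving on.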
If you expect a file name conflict between the repositories (between source1 and source2 or source1 and target), e.g. multiple repositories have
trunk/version.h, it is safest to move/rename all the source repositories to their own directories, and once the merge is done, do a careful and safe
svn mv. Here is how to rename everything (e.g. from
DIR/TO/FILE to
PROJECT.merge/DIR/TO/FILE) in a
*.dump file:
$ perl -pi -e's@^(Node-(?:copyfrom-)?path: )@${1}PROJECT.merge/@' PROJECT.dump # also prefixes copy sources, so branch and tag copies still resolve
After that, please make sure you add the directory creation into revision 1 of
PROJECT.dump. Use your text editor to insert the following lines just below the first
PROPS-END line:
Node-path: PROJECT.merge
Node-kind: dir
Node-action: add
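The rename and the directory creation can also be done in one pass. The following is a sketch, not a tested migration tool: prefix_dump is a made-up name, it assumes the dump starts at revision 1, and it inserts the directory record after the PROPS-END that closes revision 1's properties (which is the first PROPS-END line when the dump has no revision 0 record). Like the Perl one-liner, it works line by line, so a file whose contents happen to contain a line starting with Node-path: would be damaged, and a directory name containing & or a backslash would confuse awk.

```shell
# Sketch: prefix every path in a dump and add the new top-level directory
# to revision 1. prefix_dump is a made-up name, not an SVN tool.
prefix_dump() {
  dir=$1; dump=$2
  LC_ALL=C awk -v dir="$dir" '
    # Prefix the node path and, for copies, the copy source path.
    /^Node-(copyfrom-)?path: / { sub(/: /, ": " dir "/") }
    /^Revision-number: 1$/ { in_rev1 = 1 }
    { print }
    # The first PROPS-END after the revision 1 header closes that revision
    # property block, so the directory is created right after it.
    in_rev1 && !done && /^PROPS-END$/ {
      print "Node-path: " dir
      print "Node-kind: dir"
      print "Node-action: add"
      done = 1
    }
  ' "$dump"
}
```

The result goes to standard output, for example: prefix_dump PROJECT.merge PROJECT.dump >PROJECT.prefixed.dump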
Merge an SVN repository dump to a local target SVN repository
You can skip this step if the target repository is not local.
The contents of the dump file
PROJECT.dump can be added to an existing local target SVN repository using
svnadmin load PATH/TO/TARGET/SVN/REPOSITORY <PROJECT.dump . Here is an example of creating a new target SVN repository and adding multiple projects to it:
$ svnadmin create myprojects
$ cvs2svn --dumpfile=myproject1.dump cvsrepo/dir/myproject1
$ svnadmin load myprojects <myproject1.dump
$ cvs2svn --dumpfile=myproject2.dump cvsrepo/dir/myproject2
$ # (edit myproject2.dump, see below)
$ svnadmin load myprojects <myproject2.dump
$ svn ls -R file://$PWD/myprojects
Please note that
cvs2svn adds the creation of the directories
trunk,
tags and
branches to the
*.dump file. This will be a problem in
svnadmin load myprojects <myproject2.dump, because this tries to add directory
trunk, which already exists in repository
myprojects, so the operation will fail. The solution is to edit the file
myproject2.dump, and remove the following lines from near the beginning:
Node-path: trunk
Node-kind: dir
Node-action: add
Node-path: branches
Node-kind: dir
Node-action: add
Node-path: tags
Node-kind: dir
Node-action: add
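If you would rather not edit the dump by hand, the removal can be sketched with awk. The function name strip_toplevel_dirs is made up, and the sketch assumes the records look exactly as shown above: three header lines followed by an empty line. A record with extra headers (for example a copy) is left alone.

```shell
# Sketch: remove the bare "add directory" records for trunk, branches and
# tags from a dump. strip_toplevel_dirs is a made-up name.
strip_toplevel_dirs() {
  LC_ALL=C awk '
    # Print the buffered header lines unchanged and empty the buffer.
    function flush(   i) { for (i = 1; i <= n; i++) print buf[i]; n = 0 }
    n == 0 && /^Node-path: (trunk|branches|tags)$/ { buf[++n] = $0; next }
    n == 1 && $0 == "Node-kind: dir"   { buf[++n] = $0; next }
    n == 2 && $0 == "Node-action: add" { buf[++n] = $0; next }
    # A complete bare record ends here: forget the buffered lines.
    n == 3 && $0 == "" { n = 0; print; next }
    # Anything else: the buffer was not a bare record, so keep it.
    { flush(); print }
    END { flush() }
  ' "$1"
}
```

It writes the cleaned dump to standard output, for example: strip_toplevel_dirs myproject2.dump >myproject2.clean.dump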
Merge an SVN repository dump to a target SVN repository
This step works for both local and remote target SVN repositories, but it's a lot slower than the
svnadmin load method described above (which works only if the target SVN repository is local). Install
svn-pusher (see the step for it above). It doesn't matter which machine you install
svn-pusher to as long as it can connect to the target repository. Copy
PROJECT.dump created above to the machine you've installed
svn-pusher to. Then run
$ svnadmin create PROJECT.copy
$ svnadmin load PROJECT.copy <PROJECT.dump
$ svn-pusher push file://$PWD/PROJECT.copy URL://TO/TARGET/SVN/REPOSITORY # slow
$ rm -rf PROJECT.copy
Please note that revision numbers in messages reported by
svn-pusher are usually off by one, e.g.
Committed revision 1 from revision 0. This is normal; the revision numbers will match perfectly in the target repository.
You can use
svn-pusher multiple times on the same target repository to merge multiple source repositories. If the target repository is not empty by the time you start
svn-pusher, it is safest to dump it first (to have a backup), and then please pay attention to the following facts.
svn-pusher (unlike
svnadmin load) reports a warning if a directory (e.g.
trunk) has already been added. This warning is usually harmless. If
svn-pusher wants to add a file that already exists, it will skip merging that source revision, but it will proceed with merging subsequent source revisions. This is not always what you want, because you may want to fix the conflict first by renaming files, and only then proceed with subsequent revisions. Should this happen, you may have to rebuild the target repository from scratch using the backup.
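Since a file name conflict is painful to repair after the fact, it may be worth looking for one before pushing. Here is a sketch of such a check. The function name find_file_conflicts is made up, and it expects a listing file that you produce yourself with svn ls -R on the target. It compares only path names, and it ignores directories (the lines ending in a slash), because those only produce the harmless warning.

```shell
# Sketch: list file paths that occur both in a dump and in the target.
# find_file_conflicts is a made-up name. The listing is the saved output
# of: svn ls -R URL://TO/TARGET/SVN/REPOSITORY
find_file_conflicts() {
  dump=$1; listing=$2
  tmp_src=$(mktemp) || return 1
  tmp_dst=$(mktemp) || return 1
  # Every path touched by the dump, one per line, sorted for comm.
  LC_ALL=C sed -n 's/^Node-path: //p' "$dump" | LC_ALL=C sort -u > "$tmp_src"
  # Files in the target: svn ls -R marks directories with a trailing slash.
  LC_ALL=C grep -v '/$' "$listing" | LC_ALL=C sort -u > "$tmp_dst"
  # Print only the lines common to both lists.
  LC_ALL=C comm -12 "$tmp_src" "$tmp_dst"
  rm -f "$tmp_src" "$tmp_dst"
}
```

Typical use would be to run svn ls -R URL://TO/TARGET/SVN/REPOSITORY >target.list and then find_file_conflicts PROJECT.dump target.list. If it prints nothing, no file in the dump shares a path with a file in the target.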