---
myst:
  html_meta:
    "description": "Historical documentation for developers on the process of migrating the eOn project from Subversion (SVN) to Git."
    "keywords": "eOn SVN migration, git svn, version control history, repository migration"
---

# Migrating from SVN

```{note}
This section should only be relevant until the `2.x` release, at which point the SVN repository shall be completely removed.
```

The basic idea is to use `git svn`[^1], following the standard [git book approach](https://git-scm.com/book/en/v2/Git-and-Other-Systems-Migrating-to-Git).

## Getting the SVN sources

For the first part we don't need **every** commit, so a plain `checkout` suffices instead of a full `clone`.

```{code-block} bash
svn co https://theory.cm.utexas.edu/svn/eon eonSVNonly
```

We will use this to determine author information.

```{code-block} bash
cd eonSVNonly
svn log > loggy
svn log --xml --quiet | grep author | sort -u | \
  perl -pe 's/.*>(.*?)<.*/$1 = /' > ../authors.txt
```

Now, the `authors.txt` file can be edited to map to existing GitHub users.
It is also useful to grab the `loggy` file to try to timestamp when each user was active.

## Importing history

With these in hand, we can now actually import the `svn` history[^2] via:

```{code-block} bash
mkdir eon_svn && cd eon_svn
git svn init https://theory.cm.utexas.edu/svn/eon --no-metadata --prefix ""
git config svn.authorsfile ../authors.txt
git svn fetch
```

This will take much longer, since `fetch`, unlike `checkout`, downloads all changes.
Cross-reference with the "fuller" variant from here:

```{code-block} bash
svn log | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
# Includes things like (no author) = (no author)
```

By project convention, if the command fails with an author not defined, add an entry of the form `username = username <username>` (the format produced by the `awk` command above), then retry the `git svn fetch`.
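
The fail-and-retry loop can be shortened by checking the author map before fetching. What follows is only a sketch under stated assumptions: `svn_users.txt` (one SVN username per line) and the sample names `alice` and `bob` are hypothetical stand-ins, and the placeholder entries reuse the `user = user <user>` form written to `authors-transform.txt`.

```shell
# Hypothetical sample data standing in for the real user list and authors.txt.
workdir=$(mktemp -d)
printf '%s\n' 'alice' 'bob' '(no author)' > "$workdir/svn_users.txt"
cat > "$workdir/authors.txt" <<'EOF'
alice = alice <alice>
(no author) = (no author) <(no author)>
EOF

# List SVN users that have no "user = ..." entry in the authors file.
missing=$(awk -F ' = ' 'NR == FNR { seen[$1] = 1; next } !($0 in seen)' \
  "$workdir/authors.txt" "$workdir/svn_users.txt")
printf 'unmapped: %s\n' "$missing"

# Append placeholder entries in the "user = user <user>" form.
printf '%s\n' "$missing" | while IFS= read -r user; do
  if [ -n "$user" ]; then
    printf '%s = %s <%s>\n' "$user" "$user" "$user"
  fi
done >> "$workdir/authors.txt"
```

In a real run the user list would come from the `svn log` output rather than a hand-written file, and the appended placeholders would still need editing by hand to point at the right GitHub users.
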
Also note that it is by far preferable to use GitHub handles in place of email addresses, to reduce leaking personally identifiable information.

## Merging upstream

Since the SVN source has no tags or branches, there are no cleanup tasks required.

```{note}
`eOn` used to use CVS for managing history, which was not imported when the switch to SVN took place, so remember to rebase a dummy commit co-authored by contributors before 2010!
```

Since we already have `svn` as a branch, and even tags, it is a bit of a pain to take updates.
Essentially we will have to rebase onto the "new" commits.

```{code-block} bash
git branch -d main # extraneous
git checkout -b svn_new
git remote add origin git@github.com:TheochemUI/eongit
git pull
# Add in the commits from the older svn branch
# Just the gitignore bits
git cherry-pick b8794c316882b09e6421a3ba9f707a893accb636
git cherry-pick 35ea1448ee8707e06d4b314fbdfb0fabe859a206
```

Now we are ready to cut and push a new tag.

```{code-block} bash
git checkout svn_new
git checkout -b main_tk3
git reset --hard $LAST_TAG_COMMIT # July 1 this is 0e263123
# from to
git rebase --ignore-whitespace --rebase-merges --onto main_tk3 b8794c3^ 5505ae2
# After fixing many changes 594826a
git branch main_fin
git push -u origin main_fin
# Adding back the newer commits after the reset
git rebase --ignore-whitespace --rebase-merges --onto main_fin 0e26312^ c267e1c
git push origin HEAD:main_fin
```

Note that the `ignore-whitespace` option might be problematic for Python changes, where whitespace is significant, but for C++-only changesets it should be fine.
Documentation-only commits are ported separately and skipped during the rebase, since these will be in files "deleted by us".

```{code-block} bash
# Open next both merged
emacsclient -n $(git diff --name-only --diff-filter=U | head -n 1)
```

```{note}
Remember to diff the folders (e.g. via `meld`) after this operation, and confirm only expected changes are present.
```

[^1]: Atlassian has [pretty decent documentation and helpers](https://www.atlassian.com/git/tutorials/svn-to-git-prepping-your-team-migration), but Java is a pain.

[^2]: From this [SO answer](https://stackoverflow.com/a/79188/1895378)
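
For the folder comparison recommended after the rebase, a non-interactive alternative to `meld` could look like the sketch below. It assumes a `diff` that supports `-q` and `-x` (GNU diffutils does), and the file contents are hypothetical stand-ins created in a temporary directory; only the directory names `eonSVNonly` and `eon_svn` come from the steps earlier on this page.

```shell
# Hypothetical stand-ins for the SVN checkout and the rebased Git tree.
tmp=$(mktemp -d)
mkdir -p "$tmp/eonSVNonly/client" "$tmp/eon_svn/client" "$tmp/eon_svn/.git"
printf 'int main() { return 0; }\n' > "$tmp/eonSVNonly/client/main.cpp"
printf 'int main() { return 1; }\n' > "$tmp/eon_svn/client/main.cpp"
printf '*.o\n' > "$tmp/eon_svn/.gitignore"

# -r recurses, -q only names the files that differ, -x skips VCS metadata.
# diff exits non-zero when the trees differ, hence the `|| true`.
report=$(diff -rq -x .git -x .svn "$tmp/eonSVNonly" "$tmp/eon_svn" || true)
printf '%s\n' "$report"
```

Every line of the report is either a file that differs or a file present on one side only, so an unexpected entry is easy to spot before pushing.
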