Wednesday, March 23, 2011

Using git to synchronize and backup home directories

With git-home-history gone and nothing suitable to take its place, one must hand-roll a decent solution for version-controlling a home directory. Since git can synchronize repositories, it should also be possible to use it to synchronize the contents of a home directory across multiple machines.

This example will assume that a desktop computer (hostname 'desktop') contains the master ('remote') repository, and that a laptop (hostname 'laptop') contains the slave ('local') repository.

NOTE: It is important to be aware of which files should be left out of the repo. Because a laptop and a desktop will have different graphics hardware, the settings for GUI applications such as web browsers and window managers should not be in the repo. Files which change a lot (cache files, .bash_history, etc.) should also be kept out of the repo, as they will always cause merge conflicts. Private keys should be left out of the repo as well.

Finally, need it be said that the home directories of both machines should be backed up before trying this?


Desktop: Create and fill the repository

To begin with, create an empty git repo:

bash$ cd ~
bash$ git init .
bash$ touch .gitignore
bash$ git add .gitignore

Next, modify the .gitignore file to select which files or directories to leave out of the repo:

bash$ vi .gitignore
.*
Desktop
Downloads
Templates
tmp
mnt
*.log
*core
*.swp
*.swo
*.bak

This example ignores backup, swap, and core files, as well as directories that probably shouldn't be shared between the two machines (Desktop, Downloads, Templates, tmp, mnt). Note that all hidden files are left out of the repo by default (.*).

Now add all allowed files into the repo:

bash$ git add .
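Before running this against a real home directory, it may help to preview what the ignore rules let through. The following is a minimal sketch, assuming a throwaway directory created with mktemp stands in for the home directory; the file names are made up for the demo.

```shell
#!/bin/sh
# Preview which files "git add ." would stage, using a scratch directory
# (an assumption for the demo) instead of the real home directory.
scratch=$(mktemp -d)
git init -q "$scratch"

# A few made-up files: one that should be tracked, several that should not.
touch "$scratch/notes.txt" "$scratch/build.log" "$scratch/core" "$scratch/.bash_history"
mkdir "$scratch/Downloads" && touch "$scratch/Downloads/big.iso"

# A subset of the ignore rules from this post.
printf '%s\n' '.*' 'Downloads' '*.log' '*core' > "$scratch/.gitignore"

# --dry-run prints what would be added without touching the index.
preview=$(git -C "$scratch" add --dry-run .)
echo "$preview"

rm -rf "$scratch"
```

With these rules only notes.txt should be listed. The .gitignore file itself matches the .* pattern, which is why it was added explicitly while it was still empty.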

Now add the exceptions to the ignore rules. These are config files that are shared between the two machines; git add -f forces them into the repo even though they match a pattern in .gitignore:

bash$ git add -f .bashrc .xsessionrc .vimrc .gvimrc .gdbinitrc .ssh/config .local/share/applications
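A different approach, not the one used in this post, is to record the exceptions in .gitignore itself with negated patterns, so that a plain "git add ." picks them up. The sketch below assumes a scratch directory and a small made-up set of dotfiles. Note that a file inside an ignored directory can only be re-included if the directory itself is re-included first.

```shell
#!/bin/sh
# Alternative sketch: re-include chosen dotfiles with negated patterns
# instead of "git add -f". Runs in a scratch directory, not the real home.
scratch=$(mktemp -d)
git init -q "$scratch"

mkdir "$scratch/.ssh"
touch "$scratch/.bashrc" "$scratch/.bash_history" \
      "$scratch/.ssh/config" "$scratch/.ssh/id_rsa"

# Ignore all hidden files, then carve out the exceptions.
cat > "$scratch/.gitignore" <<'EOF'
.*
!.bashrc
!.ssh/
.ssh/*
!.ssh/config
EOF

# List untracked files that are NOT ignored, i.e. what "git add ." would take.
visible=$(git -C "$scratch" ls-files --others --exclude-standard | LC_ALL=C sort)
echo "$visible"

rm -rf "$scratch"
```

The listing should contain .bashrc and .ssh/config but neither .bash_history nor the private key .ssh/id_rsa.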

If Firefox was configured correctly (i.e. by making a tarball of ~/.mozilla/firefox on the desktop machine and extracting it to ~ on the laptop machine, instead of letting Firefox generate its own config), then the bookmarks file can be added as well:

bash$ git add -f .mozilla/firefox/*.default/bookmarks.html

This of course holds true for non-config data in the Firefox dir, such as .mozilla/firefox/*.default/ReadItLater (UPDATE: but not Zotero, as it updates its files even when nothing is being modified).

Finally, commit all of the contents to the repo:

bash$ git commit -m 'Initial home dir checkin'

The git directory now has a starting version of the home directory checked in. It can be reviewed with a tool such as QGit to ensure nothing is missing or unwanted:

bash$ qgit &

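NOTE: To make the following operations go smoothly, the following lines must be added to .git/config. They allow the laptop to push to the branch that is currently checked out on the desktop, which git refuses by default for a non-bare repository: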

[receive]
    denyCurrentBranch = false
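The same setting can be written with git config instead of editing the file by hand. A minimal sketch, shown on a scratch repository so it can be tried safely:

```shell
#!/bin/sh
# Set receive.denyCurrentBranch from the command line rather than editing
# .git/config by hand. Demonstrated on a scratch repository.
scratch=$(mktemp -d)
git init -q "$scratch"

git -C "$scratch" config receive.denyCurrentBranch false

# Read the value back to confirm it was written to the repository config.
value=$(git -C "$scratch" config --get receive.denyCurrentBranch)
echo "receive.denyCurrentBranch = $value"

rm -rf "$scratch"
```

On the desktop itself the equivalent would be to run "git config receive.denyCurrentBranch false" from the home directory.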



Desktop: Create a script to auto-commit

At this point, it is useful to create a shell script that performs a commit in the background.

bash$ mkdir -p bin
bash$ vi bin/git_commit_homedir.sh
#!/bin/sh
# Stage and commit everything in the home directory that is not ignored.
cd ~ || exit 1
git add .
git commit -m 'automated backup' .
bash$ chmod +x bin/git_commit_homedir.sh
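When nothing has changed, git commit exits with a non-zero status and prints a message, which cron may then mail. The following is a hedged variant of the script that commits only when there is something to commit. The function name commit_if_changed is hypothetical, and the use of "git add -A" (which also stages deletions) is an assumption about what is wanted.

```shell
#!/bin/sh
# Sketch of an auto-commit helper that skips the commit when nothing changed.
# commit_if_changed is a hypothetical name; the repository path is an argument
# so the function can be tried on a scratch directory before using "$HOME".
commit_if_changed() {
    repo=$1
    git -C "$repo" add -A . || return 1
    # "git status --porcelain" prints nothing when there is nothing to commit.
    if [ -n "$(git -C "$repo" status --porcelain)" ]; then
        git -C "$repo" commit -q -m 'automated backup'
    fi
}

# Demo on a scratch repository with a throwaway identity.
scratch=$(mktemp -d)
git init -q "$scratch"
git -C "$scratch" config user.name 'Example User'
git -C "$scratch" config user.email 'user@example.invalid'

echo 'first note' > "$scratch/notes.txt"
commit_if_changed "$scratch"     # creates a commit
commit_if_changed "$scratch"     # no changes, so no second commit

commits=$(git -C "$scratch" rev-list --count HEAD)
echo "commits: $commits"

rm -rf "$scratch"
```

For real use, the script would end with a call such as commit_if_changed "$HOME" in place of the demo.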

Laptop: Clone the repository

On the laptop, clone the repository from the desktop:

bash$ cd ~
bash$ mkdir -p tmp/git-repo
bash$ cd tmp/git-repo
bash$ git clone desktop:/home/$USER

Note that the repo was cloned to a temporary directory so that it will not overwrite any local files. This is important!

Move the git metadata directory to the home directory:


bash$ cd $USER
bash$ mv .git ~

Retrieve any missing files (i.e. those that exist on the desktop but not on the laptop, such as .gitignore) from the repository:

bash$ cd ~
bash$ git checkout \*
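Note that a pathspec checkout restores every matching tracked file from the index, so files that exist on the laptop but differ from the committed version are replaced as well. If only the missing files should be restored, something along the following lines may be safer. This is a sketch under stated assumptions: restore_missing is a hypothetical helper name, and the demo runs in a scratch repository.

```shell
#!/bin/sh
# Sketch: restore only files that are tracked but missing from the working
# tree, leaving locally modified files alone. restore_missing is a
# hypothetical helper name. File names containing newlines are not handled.
restore_missing() {
    repo=$1
    git -C "$repo" ls-files --deleted |
    while IFS= read -r path; do
        git -C "$repo" checkout -- "$path"
    done
}

# Demo on a scratch repository with a throwaway identity.
scratch=$(mktemp -d)
git init -q "$scratch"
git -C "$scratch" config user.name 'Example User'
git -C "$scratch" config user.email 'user@example.invalid'
echo 'shared' > "$scratch/a.txt"
echo 'desktop version' > "$scratch/b.txt"
git -C "$scratch" add a.txt b.txt
git -C "$scratch" commit -q -m 'initial'

rm "$scratch/a.txt"                          # missing on the "laptop"
echo 'laptop version' > "$scratch/b.txt"     # differs on the "laptop"

restore_missing "$scratch"
restored=$(cat "$scratch/a.txt")
kept=$(cat "$scratch/b.txt")
echo "a.txt: $restored"
echo "b.txt: $kept"

rm -rf "$scratch"
```

In the demo, a.txt comes back from the repository while b.txt keeps its local contents, so the differences can still be reviewed and committed on the laptop branch.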


Laptop: Add local changes

Create a branch for the changes that will be made next:


bash$ git checkout -b laptop



Add any local exclusions to the .gitignore file:


bash$ echo .pr0n >> .gitignore


Add any additional files to git:


bash$ git add TODO NOTES

Commit the branch:


bash$ git commit -m 'laptop additions'



Now merge the branch into master:


bash$ git checkout master
bash$ git merge laptop


Verify that the changes are suitable:


bash$ qgit &

Finally, push the changes to the desktop:


bash$ git push


Desktop: Generate canonical file versions

The desktop will now have all its files set to the laptop versions.

At this point, files that have been modified should be reviewed and edited, so that a canonical version will be stored in the repo and used by both the desktop and the laptop. QGit makes the review process fairly simple.

Note that some config files will have to source local config files that lie outside the repository (i.e. they are excluded in .gitignore). For example, .bashrc might have a line like

[ -f ~/.bash_local.rc ] && . ~/.bash_local.rc

...and .vimrc might have a line like

if filereadable(expand("$HOME/.vim_local.rc"))
    source ~/.vim_local.rc
endif

The files .bash_local.rc and .vim_local.rc will be listed in .gitignore, and will have machine-specific configuration such as custom prompts, font size (e.g. in .gvimrc), etc.
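The guarded-source pattern can be tried in isolation. The sketch below uses a scratch file in place of ~/.bash_local.rc, and LOCAL_PROMPT is a made-up variable standing in for any machine-specific setting.

```shell
#!/bin/sh
# Sketch of the "source a local file if it exists" pattern, using a scratch
# file in place of ~/.bash_local.rc. LOCAL_PROMPT is a made-up variable.
scratch=$(mktemp -d)
local_rc="$scratch/.bash_local.rc"

# First pass: the local file does not exist, so nothing is sourced.
LOCAL_PROMPT='default> '
if [ -f "$local_rc" ]; then . "$local_rc"; fi
before=$LOCAL_PROMPT

# Second pass: create a machine-specific override and source it.
echo "LOCAL_PROMPT='laptop> '" > "$local_rc"
if [ -f "$local_rc" ]; then . "$local_rc"; fi
after=$LOCAL_PROMPT

echo "before: $before"
echo "after:  $after"
rm -rf "$scratch"
```

The if form behaves like the one-line && version, but leaves a zero exit status when the file is absent, which only matters if the line is the last one in a script or function.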

Once the canonical versions of the files have been created, they are committed:

bash$ git commit -m 'canonical version' .
bash$ git tag 'canonical'

Laptop: Pull canonical versions

The canonical versions can now be pulled down to the laptop. Note that any supporting files (e.g. .bash_local.rc) will have to be created on the laptop.

bash$ git pull

Laptop & Desktop: Add cron job

In order for git to automatically track changes to the home directory, both the laptop and the desktop will need to add a cron job for running git_commit_homedir.sh.

The following crontab entry runs the commit script every two hours, discarding both stdout and stderr:

bash$ crontab -e
0 */2 * * * /home/$USER/bin/git_commit_homedir.sh > /dev/null 2>&1

...of course $USER must be replaced with the actual username.
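The order of the two redirections in a crontab entry matters, because the shell applies them left to right. The following sketch shows the difference; noisy is a stand-in function for any command that writes to both streams.

```shell
#!/bin/sh
# Demonstrates why the order of "2>&1" and "> /dev/null" matters.
# noisy is a stand-in for any command that writes to both streams.
noisy() {
    echo 'to stdout'
    echo 'to stderr' >&2
}

# stderr is pointed at the current stdout first, then stdout alone is
# discarded, so the stderr text still escapes (cron would mail it).
leaky=$(noisy 2>&1 > /dev/null)

# stdout is discarded first, then stderr follows it to /dev/null.
silent=$(noisy > /dev/null 2>&1)

echo "2>&1 > /dev/null  leaves: '$leaky'"
echo "> /dev/null 2>&1  leaves: '$silent'"
```

Only the second form silences the command completely. The first form is still useful when error messages from the job should reach cron's mail.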

Note: Some provision must be made for pushing the laptop repo to the desktop. This can be done in a cron job, but is probably better suited to an if-up (on network interface up) script.

UPDATE: Be careful when pushing; the desktop must be forced to update its working tree, or its next commit will delete files on the laptop. The following script will do the trick:

#!/bin/sh
# Push to the desktop, then bring the desktop's working tree up to date.
cd "$HOME" || exit 1

git push && ssh desktop 'git reset --merge `git rev-list --max-count=1 master`'
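As an alternative to resetting the desktop over ssh, git 2.3 and later accept the value updateInstead for receive.denyCurrentBranch, which makes a push update the receiving working tree when that tree is clean. This is not what the post originally used. The sketch below simulates both machines with scratch directories, so the names desktop and laptop here are only local paths.

```shell
#!/bin/sh
# Sketch: let a push update the receiving working tree via
# receive.denyCurrentBranch=updateInstead (needs git 2.3 or later).
# "desktop" and "laptop" are scratch directories standing in for the machines.
scratch=$(mktemp -d)
desktop="$scratch/desktop"
laptop="$scratch/laptop"

git init -q "$desktop"
git -C "$desktop" config user.name 'Example User'
git -C "$desktop" config user.email 'user@example.invalid'
git -C "$desktop" config receive.denyCurrentBranch updateInstead
echo 'from desktop' > "$desktop/README"
git -C "$desktop" add README
git -C "$desktop" commit -q -m 'initial'

git clone -q "$desktop" "$laptop"
git -C "$laptop" config user.name 'Example User'
git -C "$laptop" config user.email 'user@example.invalid'
echo 'from laptop' > "$laptop/notes.txt"
git -C "$laptop" add notes.txt
git -C "$laptop" commit -q -m 'laptop additions'

# Push the current branch; the desktop working tree is updated if it is clean.
git -C "$laptop" push -q origin HEAD

if [ -f "$desktop/notes.txt" ]; then pushed=yes; else pushed=no; fi
echo "notes.txt present on desktop: $pushed"

rm -rf "$scratch"
```

If the desktop working tree has uncommitted changes, the push is refused, so the auto-commit job on the desktop should run before the laptop pushes.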

Of course passwordless ssh should be set up for this to work. A similar problem exists when pulling from the server: a "git checkout \*" must be performed to create any missing files.


Desktop: Add backup script and cron job

At this point, a backup script and cron job can be added to the desktop server. The directory ~/.git is all that needs to be backed up; a shell script can rsync it to a server.
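A minimal sketch of such a backup script follows. The helper name backup_git_dir and the destination backupserver:backups/home.git are placeholders, not real names from this setup; the BACKUP_DEST variable and the --run switch are likewise assumptions made for the example.

```shell
#!/bin/sh
# Sketch of a backup script that mirrors ~/.git to another location.
# backup_git_dir is a hypothetical helper; "backupserver:backups/home.git"
# is a placeholder destination, not a real host.
backup_git_dir() {
    src=$1     # directory that contains .git
    dest=$2    # rsync destination (local path or host:path)
    # -a preserves permissions and times; --delete drops files that no
    # longer exist in the source so the copy stays an exact mirror.
    rsync -a --delete "$src/.git/" "$dest/"
}

# Run the backup only when asked, so the function can be tested in isolation.
if [ "${1:-}" = "--run" ]; then
    backup_git_dir "$HOME" "${BACKUP_DEST:-backupserver:backups/home.git}"
fi
```

A cron entry on the desktop could then call the script with --run, ideally at a time that does not overlap with the auto-commit job.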

2 comments:

  1. Hi thank you for your solution, the one amongst many, that finally worked for me.
    I also set up an independent bare repo to manage all of them and to work as origin.

    1. I eventually ended up doing the same -- things go much more smoothly if you use a bare repo on the desktop server end.

      I also moved away from using a single git repo for my home dir (too many things I don't want in a repo -- temp files, VMs, binaries, etc) and am instead using repos for certain subdirectories (config, doc, projects, notes, todo), with a shell script to run a commit+push on them in one fell swoop.

      This requires a bit more effort to set up, as one has to create a subdirectory for config files (e.g. .bashrc), move all config files into it, symlink from ~ to the config files, then finally make a repo to manage it.

      Once set up, though, it works pretty good.
