Backup And Archiving At Home

I have several computers at home, and it is important that they are properly backed up so that I do not lose data. I want to show an example of how this is done, but first a number of preliminaries.

I have decided that backups should, where possible, be placed on a different disk from the source, so that I do not lose data through disk corruption or a hardware failure. There are certain directories (for example /etc, and the subdirectory mydocs in my home directory) whose files I change regularly. I would like to keep the changes to those files so that I can revert them, or ensure that when I delete a file a copy is archived for posterity.

I break down my file layout into separate filesystems, and in particular I have separated out:

- the backup directory (well, it is on another disk)
- my home directory
- certain directories (particularly on my server) which are likely to contain massive amounts of data (such as /var/lib/svn, where all the svn repositories live)

Where possible I use lvm to manage most partitions as logical volumes, so creating, deleting and resizing them is easy.

Once a file changes in one of the special directories (such as /etc), the superseded copy is stored in one of several snapshot directories, each related to a point back in time. I have:

- the latest snapshot
- daily snapshots, from yesterday up to one week old
- weekly snapshots, up to one month old
- monthly snapshots, up to six months old
- anything older than six months, which is assumed to be queueing for eventual manual writing to CD to be kept for ever

So how do I do it?

Firstly, simple backup is done using rsync with the -aHxq and --delete switches. This causes the destination directory (and its subdirectories) to become a copy (i.e. a backup) of the source directory (and its subdirectories). The -x switch limits this to a single filesystem. Where I need to keep the changes to a specific directory, I also use the --backup-dir switch to write them into the latest snapshot directory.
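To make the rsync step concrete, here is a minimal sketch of how those switches behave. It is not the elided part of my script: every path is a throwaway temporary directory standing in for the real layout, and I pass --backup explicitly alongside --backup-dir in case the installed rsync does not imply it.

```shell
#!/bin/sh
# Sketch only: all paths are temporary stand-ins, not the real disk layout.
WORK=$(mktemp -d)
SRC=$WORK/src                  # stands in for a directory such as /etc
DEST=$WORK/bak/etc             # the mirror, i.e. the backup proper
SNAP=$WORK/bak/archive/snap    # receives files that get replaced or deleted

mkdir -p "$SRC" "$DEST" "$SNAP"
echo "version 1" > "$SRC/hosts"
echo "temporary" > "$SRC/scratch"

# First run: DEST becomes a copy of SRC. Nothing is replaced, so SNAP stays empty.
rsync -aHxq --delete --backup --backup-dir="$SNAP" "$SRC/" "$DEST/"

# Change one file (to a different size, so rsync's quick check notices even
# within the same second) and delete another, then run the same command again.
echo "version 2 (edited)" > "$SRC/hosts"
rm "$SRC/scratch"
rsync -aHxq --delete --backup --backup-dir="$SNAP" "$SRC/" "$DEST/"

# DEST matches SRC again; the old hosts and the deleted scratch are now in SNAP.
cat "$DEST/hosts"
cat "$SNAP/hosts"
ls "$SNAP"
```

The trailing slash on the source matters: it tells rsync to copy the contents of the directory rather than the directory itself.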

Archiving the snapshot directory is done daily, just before the backup (so it is actually part of the daily backup script, which is run by installing it as /etc/cron.daily/backup). The snapshot is turned into the newest daily snapshot simply by using mv to rename the directory from snap to daily.1 (of course, the old daily.1 should already have been renamed to daily.2 beforehand). Similar scripts, for archiving only, are placed in /etc/cron.weekly and /etc/cron.monthly.
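The renaming described above can also be written as a loop. This is only a sketch: rotate is a hypothetical helper that does not appear in my scripts, ARCH points at a temporary directory rather than /bak/archive, the numbering follows the scripts, and it assumes the oldest snapshot has already been merged and removed.

```shell
#!/bin/sh
# Sketch only: ARCH is a throwaway temp directory, not /bak/archive.
ARCH=$(mktemp -d)
mkdir -p "$ARCH/snap" "$ARCH/daily.1" "$ARCH/daily.2"
echo "today" > "$ARCH/snap/marker"
echo "yesterday" > "$ARCH/daily.1/marker"
echo "two days ago" > "$ARCH/daily.2/marker"

# rotate PREFIX MAX: rename PREFIX.(MAX-1) to PREFIX.MAX, then PREFIX.(MAX-2)
# to PREFIX.(MAX-1), and so on down to PREFIX.1. Hypothetical helper.
# Assumes PREFIX.MAX no longer exists when it is called.
rotate() {
    prefix=$1
    n=$2
    while [ "$n" -gt 1 ]; do
        prev=$((n - 1))
        if [ -d "$ARCH/$prefix.$prev" ]; then
            mv "$ARCH/$prefix.$prev" "$ARCH/$prefix.$n"
        fi
        n=$prev
    done
}

# Shift the daily snapshots along, then promote snap and start a fresh one.
rotate daily 6
if [ -d "$ARCH/snap" ]; then mv "$ARCH/snap" "$ARCH/daily.1"; fi
mkdir -p "$ARCH/snap"

ls "$ARCH"
```

Working from the highest number downwards is what keeps each mv from landing on a directory that still exists.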

The interesting trick comes when merging a daily snapshot into an already existing weekly snapshot (or weekly into monthly, or monthly into the CD archive). Using cp -alf just makes an additional link in the weekly snapshot to the file already in the daily snapshot (so it happens fast, as there is no file copying). Where a file already exists in the weekly snapshot it is replaced by the link (effectively overwriting the old version); where a file did not already exist, a new link is simply created. If the old daily snapshot is removed at this point, this just unlinks the file from the daily snapshot but leaves it in the weekly.
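The hard link merge is easy to try out in isolation. The sketch below assumes GNU cp and GNU stat, and uses a temporary directory with made-up file contents in place of the real archive.

```shell
#!/bin/sh
# Sketch only: a throwaway temp directory stands in for /bak/archive.
ARCH=$(mktemp -d)
mkdir -p "$ARCH/daily.6/etc" "$ARCH/weekly.1/etc"
echo "old hosts" > "$ARCH/weekly.1/etc/hosts"   # already in the weekly snapshot
echo "new hosts" > "$ARCH/daily.6/etc/hosts"    # newer version of the same file
echo "new fstab" > "$ARCH/daily.6/etc/fstab"    # only in the daily snapshot

# Merge: -a recurses and preserves attributes, -l makes hard links instead of
# copying data, -f replaces files that already exist in the destination.
cp -alf "$ARCH/daily.6/"* "$ARCH/weekly.1"

# Both names now point at the same data, so the link count is 2.
LINKS_BEFORE=$(stat -c %h "$ARCH/weekly.1/etc/hosts")

# Removing the daily snapshot only drops one of the two names.
rm -rf "$ARCH/daily.6"
LINKS_AFTER=$(stat -c %h "$ARCH/weekly.1/etc/hosts")

echo "links before: $LINKS_BEFORE, after: $LINKS_AFTER"
cat "$ARCH/weekly.1/etc/hosts"
```

Hard links only work within one filesystem, which is why all the snapshot directories live under the same archive directory.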

So here is the relevant code from the files:

/etc/cron.daily/backup

#!/bin/sh

logger -t "Backup:" "Daily backup started"
ARCH=/bak/archive

if [ -d $ARCH/daily.6 ] ; then
    if [ ! -d $ARCH/weekly.1 ] ; then mkdir -p $ARCH/weekly.1 ; fi
    # Now merge in stuff here with what might already be there using hard links
    cp -alf $ARCH/daily.6/* $ARCH/weekly.1
    # Finally lose the rest
    rm -rf $ARCH/daily.6
fi

# Shift along snapshots
if [ -d $ARCH/daily.5 ] ; then mv $ARCH/daily.5 $ARCH/daily.6 ; fi
if [ -d $ARCH/daily.4 ] ; then mv $ARCH/daily.4 $ARCH/daily.5 ; fi
if [ -d $ARCH/daily.3 ] ; then mv $ARCH/daily.3 $ARCH/daily.4 ; fi
if [ -d $ARCH/daily.2 ] ; then mv $ARCH/daily.2 $ARCH/daily.3 ; fi
if [ -d $ARCH/daily.1 ] ; then mv $ARCH/daily.1 $ARCH/daily.2 ; fi
if [ -d $ARCH/snap ] ; then mv $ARCH/snap $ARCH/daily.1 ; fi

# Collect new snapshot archive stuff doing daily backup on the way

mkdir -p $ARCH/snap
…

/etc/cron.weekly/backup

#!/bin/sh
# AKC – see below for history

ARCH=/bak/archive
if [ -d $ARCH/weekly.5 ] ; then
    # if any of the files only have one hard link, it needs to be passed on
    if [ ! -d $ARCH/monthly.1 ] ; then mkdir -p $ARCH/monthly.1 ; fi
    # Merge into monthly archive
    cp -alf $ARCH/weekly.5/* $ARCH/monthly.1
    # Remove the merged snapshot
    rm -rf $ARCH/weekly.5
fi

# Shift along snapshots
if [ -d $ARCH/weekly.4 ] ; then mv $ARCH/weekly.4 $ARCH/weekly.5 ; fi
if [ -d $ARCH/weekly.3 ] ; then mv $ARCH/weekly.3 $ARCH/weekly.4 ; fi
if [ -d $ARCH/weekly.2 ] ; then mv $ARCH/weekly.2 $ARCH/weekly.3 ; fi
if [ -d $ARCH/weekly.1 ] ; then mv $ARCH/weekly.1 $ARCH/weekly.2 ; fi

/etc/cron.monthly/backup

#!/bin/sh
# AKC – see below for history

ARCH=/bak/archive
CDARCH=/bak/archive/CDarch-`date +%Y`
MACH=piglet

if [ -d $ARCH/monthly.6 ] ; then
    if [ ! -d $CDARCH ] ; then mkdir -p $CDARCH ; fi
    cp -alf $ARCH/monthly.6/* $CDARCH
    rm -rf $ARCH/monthly.6
fi

# Shift along snapshots

if [ -d $ARCH/monthly.5 ] ; then mv $ARCH/monthly.5 $ARCH/monthly.6 ; fi
if [ -d $ARCH/monthly.4 ] ; then mv $ARCH/monthly.4 $ARCH/monthly.5 ; fi
if [ -d $ARCH/monthly.3 ] ; then mv $ARCH/monthly.3 $ARCH/monthly.4 ; fi
if [ -d $ARCH/monthly.2 ] ; then mv $ARCH/monthly.2 $ARCH/monthly.3 ; fi
if [ -d $ARCH/monthly.1 ] ; then mv $ARCH/monthly.1 $ARCH/monthly.2 ; fi

UPDATE: As of 26th February 2011 the basic mechanisms shown in this post are still in use. However, some of the detail is wrong (the disk layout and partitions), though nothing that detracts from the basic message. See also my recent post about keeping personal data backed up.