chacadwa.com

Technical blog and writings by Micah Webner.

Drupal rsync backup scripts

There are a lot of other backup scripts for Drupal on the web, but I promised somebody I'd post mine in response to this week's episode of the Geeks and God podcast. There are variations based on how the different servers are set up, but here's a basic summary of how I do it.

For this example, all backups are being done using rsync over ssh. This can be really simple to set up. An old Linux box in the basement with a couple of gigs of free space can be used as a backup for multiple sites.

Start by setting up the receiving machine. Install the rsync package if it's not already available. (I'm assuming you can handle setting up your router with port-forwarding to sshd; I'm not going to cover that here.) Create a regular unprivileged user to receive the backups. In that user's home directory, create a target directory for each backup, and create an rsyncd.conf file with an entry for each server you'll be backing up. (When rsync runs in daemon mode over a remote shell, it reads rsyncd.conf from the current directory, which in this setup is the backup user's home directory.)

use chroot = false
[server1]
path = /home/user1/backups/server1
read only = false
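The receiving-machine setup can be sketched in a couple of commands (the paths and the server1 name are examples matching the config entry above; rsync in daemon-over-remote-shell mode looks for rsyncd.conf in the current directory, normally the backup user's home):

```shell
# Run as the unprivileged backup user on the receiving machine.
mkdir -p "${HOME}/backups/server1"   # one target directory per site

# Minimal daemon config; rsync reads this from $HOME when started over ssh
cat > "${HOME}/rsyncd.conf" <<EOF
use chroot = false
[server1]
path = ${HOME}/backups/server1
read only = false
EOF
```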

Now create an ssh key pair to handle authentication. This key pair will only be used for backups, and won't require a passphrase.

$ ssh-keygen -t dsa -b 1024 -N "" -f ~/.ssh/backup

The -N "" provides an empty passphrase, and the -f option stores the key files in the target user's .ssh folder.

Now cd to the .ssh folder and append the new public key to the authorized_keys file:

$ cd ~/.ssh
$ cat backup.pub >> authorized_keys

Now use the editor of your choice to prepend a forced command to the new key (the last line) in the authorized_keys file:

command="rsync --server --daemon ." ssh-dss AAAAB3...

This tells sshd that connections authenticated with this key pair may only run the specified rsync daemon command on your local backup machine.
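If you want to lock the key down further, authorized_keys accepts additional standard sshd options alongside command= (these extras are suggestions, not part of the original setup; the from= address is a placeholder for your web server's IP or hostname pattern):

```
command="rsync --server --daemon .",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding,from="203.0.113.10" ssh-dss AAAAB3...
```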

Now copy the ~/.ssh/backup private key file to the web server. We'll specify it as the key that ssh should use to contact your backup host. On your server, create a script similar to the following:

#!/bin/bash
#
# Drupal Database Info
#
DBUSER="username"
DBPASS="password"
DBNAME="database"
#
# Server Subdirectories
#
WEBDOCS="/path/to/htmlfiles"
BACKUPS="${HOME}/backups"
#
# Backup Server User and Hostname
#
BKUSER="username"
BKHOST="hostname"
MYNAME="server1"
#
# Rsync and Communications parameters
#
PARAMS="-azxC --exclude-from=${HOME}/backup/excludes --delete-excluded --stats"
CPARMS="--bwlimit=32 --timeout=1800"
RSH="ssh -c blowfish -i ${HOME}/.ssh/backup -l ${BKUSER}"
#
# echo "*** Performing Database Dumps ***"
#
/usr/bin/mysqldump -u ${DBUSER} -p${DBPASS} --opt ${DBNAME} |
gzip > ${BACKUPS}/${DBNAME}.sql.gz
#
# echo "*** Performing Remote Rsync ***"
#
rsync "$@" ${PARAMS} ${CPARMS} --rsh="${RSH}" ${WEBDOCS} ${BACKUPS} ${BKHOST}::${MYNAME}

Obviously, you'll need to tweak all of the parameters. In this example, MYNAME=server1 corresponds to the [server1] section in rsyncd.conf on the target machine.

Because the script passes its own command line arguments through to rsync, you can append "-v --progress" when invoking it, and rsync will be very verbose during testing.

The excludes file contains files or directories that should not be backed up. This would typically include your Drupal tmp directory (if within the backup area) and things of that nature. Here's an example from a site that also has Gallery2 installed:

httpdocs/sites/default/gallery.data/cache/
httpdocs/sites/default/gallery.data/sessions/
httpdocs/sites/default/gallery.data/tmp/
httpdocs/sites/default/gallery.data/smarty/
httpdocs/sites/all/tmp/

Add this script to cron on your web server, and it will back up regularly.

35 0 * * * /path/to/script 2>&1|mail -s "Backup Results" user@example.com

There are some shortcomings in this script. It only keeps one backup of your site. If the site is hacked or otherwise damaged, and you don't catch it before the backup runs, you'll write the bad data over your good backup. I'm pondering the best way to address this.
