Backups using rsync and ssh

I have a server (which just handed you this web page, in fact) located in downtown Orlando, Florida. I also live in the Orlando area, but the server is not in my house. For the most part, the server just sits in the rack, serving up web pages and handling email, without any problems at all.

However, as we all know, sometimes bad things happen. Within the last year, I have had a hard drive physically fail in the server (it wouldn't spin at all). I have also had the air conditioning fail at the facility in Melbourne where my server was housed until a month ago. Luckily my server survived without any permanent damage, but it could have been a catastrophic failure. When I couldn't reach the server, and couldn't reach the colo provider on the phone, I drove down to Melbourne and found the temperature in the room to be about 95F (35C), with the windows open and a four-foot box fan trying to pull the heat out of the room (without much luck, since there was no air "input" on the other side of the building).

To prevent this from being a "total loss" event, I have a machine at my house (a "Buffalo Linkstation" NAS device, which has been "hacked" and is now running Debian). It runs a cron job every four hours that uses ssh to connect to the server, and rsync to copy any files which have changed since the previous backup.

And by running multiple instances of this cron job, I am able to back up multiple servers on the same "backup server".


Background

The way it works is this:


Creating the ssh key

The first step is to create an ssh key pair. When the script connects to the server, it will use this key to authenticate instead of using a password. This will allow it to access the server without somebody having to type in a password every time the backup runs.

Creating the key pair involves using the "ssh-keygen" command on the backup server. The process looks like this:

# cd ~/.ssh
# ssh-keygen -t dsa -b 1024 -f id_dsa_backup -C 'rsync backups'
Generating public/private dsa key pair.
Enter passphrase (empty for no passphrase):      <-- just hit ENTER
Enter same passphrase again:                     <-- again, just hit ENTER
Your identification has been saved in id_dsa_backup.
Your public key has been saved in id_dsa_backup.pub.
The key fingerprint is:
08:35:3d:bb:94:bf:71:fe:8d:7e:cd:23:52:4f:4b:5a rsync backups      <-- the fingerprint will be different

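As you can see, we are creating a key with no passphrase. You will need to make sure that the "id_dsa_backup" file (or whatever you named it) is not allowed to fall into the hands of anybody who shouldn't have root-level access to the backup server AND to the servers you're backing up.

Note that OpenSSH 7.0 and later disable DSA ("ssh-dss") keys by default, so on a current system you will probably need an RSA or Ed25519 key instead (for example "-t ed25519"). The rest of the procedure stays the same, apart from the file name you choose and the key type shown at the start of the public key.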
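Since the private key has no passphrase, its file permissions are the only thing protecting it. ssh-keygen normally creates the file readable only by its owner, but it costs nothing to check. The helper below is a minimal sketch; the function name "lock_down_key" and the default path are illustrative assumptions, not something from the original scripts.

```shell
#!/bin/sh
# Sketch: make a private key readable and writable by its owner only,
# then print the resulting mode string.
# The default path is an assumption; pass your own key file as $1.
lock_down_key() {
    key=${1:-"$HOME/.ssh/id_dsa_backup"}
    [ -f "$key" ] || { echo "no such key: $key" >&2; return 1; }
    chmod 600 "$key" || return 1
    # "ls -l" prints the mode string first; keep only those ten characters.
    ls -l "$key" | cut -c1-10
}
```

Calling "lock_down_key ~/.ssh/id_dsa_backup" should print "-rw-------". The ssh client refuses to use a private key whose permissions are too open, so this also avoids a confusing failure later.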


Installing the key on the server

Configuring sshd

Before we install the key itself, we need to make sure that sshd on the server (i.e. the machine from which we're pulling the backup) is configured to allow key-based authentication.

On the server, find your "sshd_config" file. This will usually be in an "/etc/ssh" directory, although some systems use "/etc" instead. We need to check the following lines in the file:

If you had to change any of the options in the "sshd_config" file, you should restart your sshd process. This is normally done using a command like "service sshd restart" or "/etc/init.d/sshd restart".
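As a rough aid for that check, the sketch below lists the uncommented settings that most commonly affect key-based root logins. The three option names are real sshd_config keywords, but treating them as the relevant checklist is an assumption, and the function name "show_key_auth_settings" is made up for this example.

```shell
#!/bin/sh
# Sketch: show the active (uncommented) sshd_config lines that usually
# matter for logging in as root with a key.
# Assumption: these three keywords are the ones worth checking.
show_key_auth_settings() {
    config=${1:-/etc/ssh/sshd_config}
    grep -Ei '^[[:space:]]*(PubkeyAuthentication|AuthorizedKeysFile|PermitRootLogin)[[:space:]]' "$config"
}
```

Any option that does not appear in the output is using sshd's built-in default. If the key will only ever be used with a forced command, "PermitRootLogin forced-commands-only" is one setting worth considering. Running "sshd -t" before restarting checks the file for syntax errors.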

Installing the key and forced command

The next step is to install the public key in root's ".ssh/authorized_keys" file on the server. This tells sshd that the corresponding private key is allowed to log in as root. In addition, we will also restrict that key's rights to running one single specific command.

The first step is to copy the "id_dsa_backup.pub" file to the server. This can be done with any mechanism you like: you can write it to a USB stick or a floppy on the backup server and then read it on the main server, you can use FTP or scp to transfer it, or you can even cat the file on the backup server and copy/paste it into the main server (as long as you're careful to ensure that the file is not modified in transit; it's supposed to be one very long line of text).

I'll leave the mechanics of copying the file up to you. Just make sure that you ONLY copy the "id_dsa_backup.pub" file. You DO NOT NEED the "id_dsa_backup" file (the secret key) on the servers from which you will be pulling data, and it should not be there at all.

Once the file is copied to the main server (the one from which you will be pulling the backups), move it to root's ".ssh" directory. We could then add it to the "authorized_keys" file with a command like "cat id_dsa_backup.pub >> authorized_keys". However, that would give the key full root access to the server, meaning that not only could it be used to pull backups, but it could be used to get a shell with root access as well.

Obviously this isn't a very good idea. A private key without a passphrase having unrestricted access to a root shell is very unsafe: if somebody ever got a copy of the secret key file, they would have full root-shell access to the server.

We can prevent this by attaching a "forced command" to the key, so that when the key is used to authenticate, it will always run the forced command, and cannot be used to run any other command.

Of course, in order to do this, we need a suitable command which can be attached to the key. And as it happens, I've already written one, which I call "allow-backup". You are welcome to use it, or of course you can write your own script which does the same thing.

File: allow-backup
Size: 2,585 bytes
Date: 2008-01-13 04:42:41 +0000
MD5: e6e3d1eb2f198cbd55421fae6549cd5c
SHA-1: 842cfcf265f5affd36dbd4e215d8290d7400565d
RIPEMD-160: 8c33ee2546ee17a3acd9a17320653851301cf928
PGP Signature: allow-backup.asc
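
If you would rather write your own, the general shape of such a script is simple. The sketch below is NOT the allow-backup script; it is a hypothetical stand-in that relies on one documented sshd behaviour: when a forced command runs, the command the client asked for is available in the SSH_ORIGINAL_COMMAND environment variable. The assumptions here are that a pull backup shows up as a command starting with "rsync --server --sender", and that a rejection is simply reported on stderr (the real script can send an email instead).

```shell
#!/bin/sh
# Hypothetical forced-command wrapper; not the author's allow-backup script.

# Return 0 only if the requested command looks like the server side of an
# rsync pull, and contains no shell metacharacters.
is_allowed_backup_command() {
    case "$1" in
        *";"*|*"&"*|*"|"*|*'`'*|*'$'*|*"<"*|*">"*) return 1 ;;
        "rsync --server --sender "*) return 0 ;;
        *) return 1 ;;
    esac
}

# sshd sets SSH_ORIGINAL_COMMAND when the client requested a command.
# With no command requested (an interactive login attempt) the script
# simply falls off the end and the session closes.
if [ -n "${SSH_ORIGINAL_COMMAND:-}" ]; then
    if is_allowed_backup_command "$SSH_ORIGINAL_COMMAND"; then
        set -f                      # no filename expansion on the arguments
        exec $SSH_ORIGINAL_COMMAND  # word-split on purpose, no shell involved
    else
        echo "rejected: $SSH_ORIGINAL_COMMAND" >&2
        exit 1
    fi
fi
```

The check is deliberately strict: anything that is not an rsync sender invocation is refused, so the key cannot be used to start a shell or to push files onto the server.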

On my server, I have saved this script (with the appropriate addresses for emailing my cell phone if needed) as "/root/bin/allow-backup". To attach the command to the key, I first built a work file with the forced command attached to the beginning of the public key:

# cd ~/.ssh
# /bin/echo -n 'command="/root/bin/allow-backup" ' > z
# cat id_dsa_backup.pub >> z
# cat z
command="/root/bin/allow-backup" ssh-dss AAAAB3NzaC1kc3MAAACBAP...AIcn1bnVulBbkdAEZhen rsync backups

The "z" file should be one really long line of text. You can verify this using the following command:

# wc -l z
1 z

Once you have verified that the "z" file is correct, you can add it to the "authorized_keys" file using this command:

# cat z >> authorized_keys

The name "authorized_keys" may be different on your system. This is why I told you to check the "AuthorizedKeysFile" line in the sshd_config file.

Now, anytime somebody logs into the server and uses that key to authenticate, the server will run that script instead of whatever command they were trying to run.

Oh, and you can delete the "z" file, it is no longer needed.
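
The same entry can be tightened a little further. Besides "command=", the authorized_keys format accepts options such as "no-port-forwarding", "no-X11-forwarding", "no-agent-forwarding" and "no-pty", none of which a backup needs. Adding them is a suggestion rather than part of the procedure above, and the function name "build_restricted_entry" is made up for this sketch.

```shell
#!/bin/sh
# Sketch: print an authorized_keys entry consisting of a forced command,
# a few extra restrictions, and the public key, all on one line.
# $1 = public key file, $2 = full path of the forced command.
build_restricted_entry() {
    pubfile=$1
    forced=$2
    printf 'command="%s",no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty ' "$forced"
    cat "$pubfile"
}
```

For example, "build_restricted_entry id_dsa_backup.pub /root/bin/allow-backup >> authorized_keys" would append the entry directly, without needing a work file.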


Installing the script on the backup server

The last part, of course, is the script on the backup server which pulls the data from the main server(s).

File: backup-servers
Size: 3,227 bytes
Date: 2008-01-13 05:40:26 +0000
MD5: 9254aec07899881b1baa98622bf60e43
SHA-1: 54c3eb9bf9dce945e751fe4e7d024f27aa670427
RIPEMD-160: 184de07f424638e07684e1f5b51ca77594e0a6e1
PGP Signature: backup-servers.asc

Again, on my backup server, this script is installed in the "/root/bin" directory. Of course it has the actual server names, and I have a disk mounted as "/backup" which is the repository for the backups. You may need to customize this; look for the BACKUPDIR and TARGET variables.

I'm also limiting the bandwidth used by the backup process to 1 Mbit/s (roughly two-thirds of a T-1 line, which carries 1.544 Mbit/s). This is done using the "--bwlimit=" parameter on the actual rsync command line. You may wish to change that, or remove it altogether.
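
If you want a different limit, keep in mind that "--bwlimit=" is not given in bits. With no suffix, current versions of rsync read the number as units of 1024 bytes per second. The conversion below is a small sketch of that arithmetic; the function name is made up, and the exact number used in the script itself may differ.

```shell
#!/bin/sh
# Convert megabits per second into the units of 1024 bytes per second
# that rsync's --bwlimit option expects by default.
# 1 Mbit/s = 1,000,000 bits/s = 125,000 bytes/s, or about 122 KiB/s.
mbit_to_bwlimit() {
    echo $(( $1 * 1000000 / 8 / 1024 ))
}

echo "--bwlimit=$(mbit_to_bwlimit 1)"
```

Shell arithmetic is integer-only, so the result is rounded down.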

Of course you'll need to customize the SERVERS variable. It lists the server names you will be backing up. For example, the actual script on my backup box has four server names in it. Just list the names, one after another, separated by spaces.

You may also wish to modify the list of files or directories which will be excluded from the backup. This is the EXCLUDES variable. The script shows how to build a longer list without making one line of text fall off the edge of the screen.
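
To give a feel for how these variables fit together, here is a small sketch that prints (rather than runs) the rsync command for each server. The variable names SERVERS, EXCLUDES and BACKUPDIR come from the description above, but the values, the one-pattern-per-line layout, the KEY variable, the helper functions and the particular rsync options are all assumptions for illustration. This is not the backup-servers script.

```shell
#!/bin/sh
# Sketch only: illustrative values, not the author's configuration.
SERVERS="www.example.com mail.example.com"
BACKUPDIR=/backup
KEY=/root/.ssh/id_dsa_backup

# One pattern per line keeps a long list readable.
EXCLUDES="
    /proc
    /sys
    /tmp
"

# Turn the whitespace-separated EXCLUDES list into --exclude options.
exclude_args() {
    for pattern in $EXCLUDES; do
        printf '%s%s ' '--exclude=' "$pattern"
    done
}

# Print the rsync command that would pull one server into its own
# directory under BACKUPDIR, authenticating with the backup key.
rsync_command_for() {
    printf 'rsync -a --delete --numeric-ids %s-e "ssh -i %s" root@%s:/ %s/%s/\n' \
        "$(exclude_args)" "$KEY" "$1" "$BACKUPDIR" "$1"
}

for server in $SERVERS; do
    rsync_command_for "$server"
done
```

Printing the commands first is a cheap way to check your customizations before letting anything touch the backup disk.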


Running the script

The first time you back up a server, it will be copying ALL files on the server. This means, depending on how much data is involved, it will probably take a long time to finish. You may want to run the script with the "-v" parameter (i.e. "./backup-servers -v") so that you can see the filenames as they're being copied.

Once you have the script configured the way you need it, you will probably want to set up a cron job to run it on a regular basis. How often you run it will depend on your needs. I pull backups from my own server every four hours, and from my clients' servers once a day.
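
As an example of what the crontab entry might look like, the sketch below prints a line that runs a command every N hours, on the hour. The function name is made up, and the path assumes the script lives in "/root/bin" as described earlier.

```shell
#!/bin/sh
# Print a crontab line that runs a command every N hours, at minute 0.
# $1 = interval in hours, $2 = command to run.
cron_every_hours() {
    printf '0 */%s * * * %s\n' "$1" "$2"
}

cron_every_hours 4 /root/bin/backup-servers
```

This prints "0 */4 * * * /root/bin/backup-servers", which can be added to root's crontab with "crontab -e". For a once-a-day backup, an entry with fixed minute and hour fields (such as "30 2 * * *") does the job.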


Notes


Contributions