After data loss is before data loss: How to make correct backups

This blog post was published 8 years ago and may or may not have aged well. While reading please keep in mind that it may no longer be accurate or even relevant.

Why should you regularly make backups? Because if you don’t, then this mistake will bite you, sooner or later. Why? Because of Murphy’s Law:

Anything that can go wrong, will go wrong.

And a variation of it, Finagle’s law, even says:

Anything that can go wrong, will -— at the worst possible moment.

So, let’s prepare right now and look at ways to back up data correctly.

RAID data mirroring is not enough

Realtime data mirroring (no matter if it is software or hardware RAID) is good, but not enough. What if your location is hit by lightning, fire or water? What if your entire system gets stolen? Then RAID would have been useless.

Threats to local backups on external media

Say that you have an external USB hard drive for you backups. This is good, but as long as it is connected to your computer, it still may be subject to destruction due to lightning.

In addition, if you leave your external USB hard drive mounted in your host OS, and if you make a mistake as an administrator, or have faulty software or malware, you may fully erase your main hard drive and the backup at the same time. This is not too unlikely!

It happened to me once. A simple mistyped rm -rf ./ as root user somewhere deep in the filesystem did exactly that (I accidentally typed a space between the dot and the slash). Yes, I erased my main hard drive and the backup (mounted under /mnt) at the same time. The data loss was disastrous.

Independent of the above, local backps are still susceptible to fire, water, or theft.

The dangers of deleted or changed files

rsync is especially good if you transfer the data to your backup location via public networks, because it only transfers changes. It also supports the --delete flag which deletes remote files when they are no longer locally present. This is generally a good idea if you want your backup to be an exact copy, otherwise your backup will become messy by accumulating many deleted files, which will make restoration not very fun.

But the --delete flag is also a danger. Say you delete an important file locally. Two days later you discover this fact, and decide to restore it from your backup. But guess what, it will be gone there too if you have synced in the meantime.

This problem is also present when changing files. The only solution to this problem is to have rolling backups (backups of your backups) in regular or increasing intervals (weekly, monthly, yearly). This will multiply the storage requirements, but you really cannot get around it.

Restoration is as important as backing up

Let’s say you have 10 perfectly done backups. But if you can’t access them any more, or not quickly enough (e.g. due to low bandwidth, etc.) they will be useless for your purposes. You need to put as much thought and effort into an effective restoration method as you put into the backups in the first place.

What works?

In general, a good backup solution depends on the specific circumstances and needs. Backups can never be perfect (100.0% reliable), there will always be a small but nevertheless existing possibility of total data loss. But you can make that possibility very, very small. As a rough guideline, the following principles seem to minimize the risk:

  • You have more than one backup.
  • You have backups of your backups (”rolling backups”).
  • You do not leave local backup media connected or mounted.
  • Your backups are at geographically different locations to compensate for local desasters.
  • If your backup is at a remote location, you fully trust the location, or use proper encryption.
  • Restoration is effective.
  • Backup and restoration is automated and tested.
  • After each backup cycle, the backups are verified. If there was a failure, the administrator is notified.

It is your responsibility!

If you should lose data, don’t blame it on ‘evil’ external circumstances, because:

Never attribute to malice that which is adequately explained by stupidity.

What if all data are still lost? Well, in this case I only can say:

Every misfortune is a blessing in disguise.

Start working on your backup solution now!

If you found a mistake in this blog post, or would like to suggest an improvement to this blog post, please me an e-mail to michael@franzl.name; as subject please use the prefix "Comment to blog post" and append the post title.
 
Copyright © 2023 Michael Franzl