View Full Version : Backup policy: how paranoid is too paranoid?

5th September 2007, 11:08 AM
At DanielWebb.us (http://danielwebb.us/software/backup/), Mr. Webb mentions that he uses Subversion for user files. ... Not everything, but "Important user files" such as what he's working on & email.

Is that too paranoid?? .. and is the "if the house burns down" thing something most people consider? (.. hmm, gmail vfs)

It got me thinking: if there were a transparent (or at least non-clunky) way to use a revision control system, maybe it's a good idea. I have a painful history of clobbering files at the command line by going too fast with mv and bash tab-completion. :eek: Does anyone else use Subversion as part of their backup policy, like Mr. Webb? I don't know much about it, especially for email..
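(For what it's worth, for the narrower mv-clobbering problem, I gather GNU mv can already keep a copy of anything it overwrites. A minimal sketch -- the `smv` wrapper name is made up, and this assumes GNU coreutils:)

```shell
# Hypothetical "safe mv" wrapper: GNU mv's --backup=numbered keeps a
# numbered copy (dest.~1~, dest.~2~, ...) of anything it would overwrite.
smv() {
    mv --backup=numbered -- "$@"
}

# demo in a scratch directory
dir=$(mktemp -d)
echo "old contents" > "$dir/report.txt"
echo "new contents" > "$dir/draft.txt"
( cd "$dir" && smv draft.txt report.txt )
ls "$dir"   # report.txt plus report.txt.~1~ -- the clobbered version survives
```

That wouldn't give history like Subversion does, but it catches the tab-completion case for free.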


I currently do backups manually, which is non-optimal, because I'm the type of guy who forgets his passport when leaving for an international flight. (In other words, "powerful ignorant.") If anyone has clever ideas or observations about backup policy, please chime in:

The hardware:
FC7 number-cruncher box, with a single high-speed 300 GB drive
Physical partitioning: 100 MB /boot, rest = LVM
Logical volumes: swap, /, /home (/home is the largest and where the data is kept)
DVD burner

Dedicated internal-network NAS backup server (750 GB, RAID-5), mounted on the FC7 machine via CIFS

.. and, in the style of Webb's Goals section:

1. 1-day recovery from physical hard drive failure (1 day after drive replacement)
2. Optional (if not overkill): recovery from command-line-flub clobbering. If it helps, I'm willing to accept a time limit, for example an SE (Stupidity Event) buffer that empties itself after 30 minutes.
3. Recovery from command-shell SE clobbering of day-old files that happened up to 5 days previous
4. 2-year archive of old, probably trash data, on the off chance we decide it's important
5. nice'd jobs that take minimal processor power, in case all the processors are crunching numbers at the designated backup time

My work: a minimal amount of code development, and a moderate amount of scientific data generation, manipulation, and visualization.

Mr. Webb mentions using LVM snapshots, but I don't really get when/where that's important, or even how to use them in a shell script. Still learning that.
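From what I've read so far, the point of the snapshot is to get a frozen, consistent view of /home while files keep changing, and then back up from the snapshot instead of the live volume. A rough sketch of the commands (the volume-group and snapshot names are guesses for my layout, this needs root, and I haven't tested it):

```shell
# create a 5 GB copy-on-write snapshot of the home LV
lvcreate --size 5G --snapshot --name home_snap /dev/VolGroup00/home

# mount it read-only and back up from the frozen view
mkdir -p /mnt/home_snap
mount -o ro /dev/VolGroup00/home_snap /mnt/home_snap
rdiff-backup /mnt/home_snap /mnt/nas/backup/home

# tear down: the snapshot only needs to live as long as the backup run
umount /mnt/home_snap
lvremove -f /dev/VolGroup00/home_snap
```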

About #1 & #3: Maybe a combination of rdiff-backup, tar, and monthly dd, or is rdiff-backup & dd alone good enough? Considering #2, what's the best way to organize what doesn't get a frequent backup?
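For reference, the rdiff-backup usage I have in mind for #1 & #3 looks something like this (paths are placeholders, untested here):

```shell
# mirror /home to the NAS; rdiff-backup keeps the newest copy as a
# plain mirror plus reverse increments for history
rdiff-backup /home /mnt/nas/backup/home

# pull back a file as it was 5 days ago (covers the SE window)
rdiff-backup -r 5D /mnt/nas/backup/home/me/thesis.tex restored-thesis.tex

# expire increments past the 2-year archive horizon
rdiff-backup --remove-older-than 2Y /mnt/nas/backup/home
```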

About #4: DVD seems like a good option, but is there a way to automate that? Compressed, incremental tar seems appropriate too.
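On the incremental-tar side, GNU tar's --listed-incremental seems to do the level-0/level-1 bookkeeping itself. A hedged sketch (the `backup_incr` name and the growisofs line are my guesses, not tested against a real burner):

```shell
# Hypothetical backup_incr: one compressed incremental level per call,
# with GNU tar tracking state in a snapshot file.
backup_incr() {
    src=$1; dest=$2; snap=$3
    tar --listed-incremental="$snap" -czf "$dest" \
        -C "$(dirname "$src")" "$(basename "$src")"
}

# demo against a scratch tree
data=$(mktemp -d); out=$(mktemp -d)
echo "run1" > "$data/a.txt"
backup_incr "$data" "$out/level0.tar.gz" "$out/tar.snap"   # full (level 0)
echo "run2" > "$data/b.txt"
backup_incr "$data" "$out/level1.tar.gz" "$out/tar.snap"   # only the new file
tar -tzf "$out/level1.tar.gz"

# burning the output directory could then be scripted, e.g. with growisofs:
#   growisofs -Z /dev/dvd -R -J "$out"
```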

5th September 2007, 04:28 PM
Personally, I use a tar backup script that runs every 24 hours to create monthly incremental backups of home directories and MySQL databases. For home I don't use off-site backups, as I think that's a little OTT for what I need.
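Roughly, the script boils down to something like this (paths and DB credentials are placeholders, and the mysqldump line obviously needs a live server, so treat it as a sketch):

```shell
# nightly job: dump the databases, then take an incremental tar of /home
DEST=/backup/$(date +%Y-%m)        # one directory per month
mkdir -p "$DEST"

# database dump, compressed (placeholder credentials)
mysqldump --all-databases -u backup -p'secret' | gzip > "$DEST/mysql-$(date +%d).sql.gz"

# fresh snapshot file each month => level 0 at month start, then dailies
tar --listed-incremental="$DEST/home.snap" \
    -czf "$DEST/home-$(date +%d).tar.gz" -C / home

# crontab entry, every 24 hours at 02:00:
#   0 2 * * * /usr/local/bin/nightly-backup.sh
```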

I do, however, have my home-directory disks in RAID-1 for redundancy; the backup is to protect against accidental file deletion.

I will be setting up a server soon for my Mum's company, and that will have off-site backups to my home server (on the simple basis that their warehouse got flooded over the summer).