rsync is one good approach, because it is efficient. However, the end result is a current copy of what's on your original drive. If you want to recover a configuration file you had yesterday, before you broke your system, then you may be out of luck. One solution to that is snapshots. And there just happens to be a technique that uses rsync to create snapshot-like backups: http://www.mikerubel.org/computers/rsync_snapshots/
The disadvantage of that approach is that a failure of the backup drive loses all of it. So make it RAID, or mirror it, or make multiple copies of it in some way. If you have two drives, you could rsync to one, take it off site, and rsync to the other the next day. Then let it run with rsync snapshots for a week. Then swap the drives and let the rsync snapshot procedure create an updated snapshot on the first drive, and let it continue running rsync snapshots for another week. Then swap again. At some point you might run out of space; then you could start pruning older snapshots.
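The pruning can be automated too. A sketch, assuming one directory per snapshot under /backup, named by date, and a six-week retention (the layout and numbers are just examples):

```shell
#!/bin/sh
# Prune snapshot directories older than 42 days (6 weeks).
# Assumes the illustrative layout above: /backup/YYYY-MM-DD, one
# directory per snapshot, where the directory mtime reflects when
# the snapshot was taken.
find /backup -mindepth 1 -maxdepth 1 -type d \
    -name '20??-??-??' -mtime +42 \
    -exec rm -rf {} +
```

Run it from cron after each backup, and old snapshots quietly age out instead of filling the drive.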
Another alternative is to go with a tape library. (You said this is a large government site, right? So budget shouldn't be a complete roadblock.) I found a relatively inexpensive one (as tape libraries go) -- the Sony LIB162 AIT5. It has 16 tape slots. Each tape holds 400G native, and might compress to more than double that depending on your data. Because it is a carousel changer, it is a simpler mechanism than most, and runs about $5K. If you start looking at LTO4 robots, with typically 24 slots or more, the prices are typically $10K or more. But that's all just ballpark. You then also have to budget for tapes. The advantage is that you can then have a cycle with nightly backups, tapes going back, say, 6 weeks or more, and off-site archival tapes.
I use Amanda to manage all that. Amanda has a planner that works out dump strategies to smooth the backup load over the entire dump cycle (say, a week), so that you don't have the huge resource hog of once-a-week full backups of everything, with the backup system semi-idle the rest of the week just doing incrementals -- http://wiki.zmanda.com/index.php/FAQ...da_use_them%3F . That's one of the main reasons I chose Amanda.
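For reference, that cycle is driven by a few parameters in amanda.conf; a sketch of the relevant settings (the values are illustrative examples, not a recommendation):

```
# excerpt from amanda.conf (values illustrative)
dumpcycle 1 week      # every filesystem gets a full dump at least once per week
runspercycle 7        # amdump runs 7 times per dumpcycle (i.e. nightly)
tapecycle 30 tapes    # number of tapes in rotation before one may be overwritten
```

Amanda's planner then decides, night by night, which filesystems get fulls and which get incrementals, so the load stays roughly even across the whole cycle.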
I actually like as much redundancy as I can manage. I have an external RAID array managed by ZFS. It uses raidz2 (roughly equivalent to RAID 6) with a hot spare, so it has 9 data drives, 2 parity drives, and 1 hot spare. It would have to experience 4 drive failures to actually lose data. Using ZFS snapshots, I run a snapshot every night, and I keep those for the semester. In addition to that, I run a 6-week tape cycle, periodic archives, and cycle tapes off site. I also have some large radmind directories containing images that allow us to configure large numbers of lab and desktop computers easily and automatically. I use rsync to keep an up-to-date copy of that directory on another server in another building. I also have a cron job that copies the Amanda configuration and index directories to a server in another building after the completion of each daily Amanda backup. So, gee, am I covered? Hmm, I'm sure I can come up with something else I ought to be doing.
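The nightly ZFS snapshot amounts to a one-line cron entry. A sketch, assuming a pool/filesystem named tank/data (the name is made up; use your own dataset):

```
# crontab entry (illustrative): snapshot tank/data nightly at 01:00,
# named by date, e.g. tank/data@2024-03-15.
# Note the backslashes: % is special in crontab and must be escaped.
0 1 * * * /sbin/zfs snapshot tank/data@$(date +\%Y-\%m-\%d)

# Housekeeping, done by hand or scripted:
#   zfs list -t snapshot            # list existing snapshots
#   zfs destroy tank/data@2024-03-15   # drop one when it ages out
```

Snapshots are nearly free to create because ZFS is copy-on-write, which is why keeping a semester's worth is practical.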
Just spend some time imagining what can go wrong. Then think about how you would recover from that. Then think some more.
If you're interested in digging deeper, check out the O'Reilly Backup and Recovery book, and/or take a look at the companion web site: http://www.backupcentral.com/