I am responsible for backing up a little less than a TB of data for a few people, centrally stored on machine A. So far my backup plan has been using rsync to copy the data over the network to a removable drive on machine B every so often (it's a rather static environment).
It's worked fine so far; however, I must admit I have rather paranoid tendencies and always fear that safety/backup mechanisms will fail when I need them most.
As such, I would like to know a simple and unobtrusive way to test the integrity of the data so I know that what is on the removable drive is a bit-for-bit copy of what's on the central machine.
So far I have tried using df to count the blocks used on the central machine's data partitions and comparing that to the blocks used on the backup drive. The numbers differ, but are pretty close. Still, I suspect this is not an accurate comparison, especially since the two drives use different filesystems (the main partition is ReiserFS, the backup is ext3).
Furthermore, this is ~1TB of other people's data, so it would not be practical or ethical for me to manually examine every single file to make sure it is intact.
I know md5sum can check the integrity of a single file, but judging from the man page it has no recursive option and no built-in way to compare two sets of hashes. I found a Windows program that seems able to scan recursively and compare, but unfortunately it does me no good here :P
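md5sum itself is not recursive, but find can drive it over a whole tree, and md5sum -c can do the comparing on the other machine. A sketch with throwaway directories standing in for the data partition and the backup drive:

```shell
set -e
# Stand-ins for machine A's data and machine B's removable drive.
mkdir -p /tmp/md5_src/docs /tmp/md5_bak
echo "hello" > /tmp/md5_src/docs/a.txt
cp -r /tmp/md5_src/. /tmp/md5_bak/

# On machine A: hash every file, recording relative paths so the
# list can be checked against a differently-mounted backup tree.
( cd /tmp/md5_src && find . -type f -exec md5sum {} + ) > /tmp/data.md5

# Copy data.md5 to machine B, then verify the backup against it.
# md5sum -c exits nonzero and names any file whose hash differs or
# is missing; --quiet suppresses the "OK" line for files that match.
( cd /tmp/md5_bak && md5sum -c --quiet /tmp/data.md5 )
```

This catches silent corruption as well as files missing from the backup, though files that exist only on the backup side go unnoticed, and like any full-tree checksum it reads every byte once per run.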
Any tips? How do you manage to sleep at night when you consider your backup scheme?