Testing integrity of backups?
Hello,
I am responsible for backing up a little less than a TB of data for a few people, centrally stored on machine A. So far my backup plan has been to use rsync to copy the data over the network to a removable drive on machine B every so often (it's a rather static environment). It has worked fine so far, but I must admit to having very paranoid tendencies, and I always fear that safety/backup mechanisms will fail when I need them most ;)

As such, I would like to know a simple and unobtrusive way to test the integrity of the data, so I know that what is on the removable drive is a bit-for-bit copy of what's on the central machine. So far I have tried using df to count the number of blocks used on the central machine's data partitions and compare it to the blocks used on the backup drive. The numbers differ but are pretty close. Somehow I get the feeling this is not an accurate way to compare, especially since the two drives use different filesystems (main is on reiser, backup is ext3). Furthermore, this is ~1TB of other people's data, so it would be neither practical nor ethical for me to manually examine every single file to make sure it is intact.

I know md5sum is a way to check the integrity of a single file, but judging from the man page it doesn't have a recursive option or any way to compare hashes. I found this Windows program which seems to be able to scan recursively and compare. Unfortunately it does me no good here :P

Any tips? How do you manage to sleep at night when you consider your backup scheme? :)

Thanks
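How about something along the lines of:
Code:
cd /original && find . -type f -exec md5sum {} \; | sort -k 2 >/tmp/md5sums.orig
Run the same command from the top of the backup copy, writing to /tmp/md5sums.copy. Running find from inside each directory keeps the paths relative, and sorting puts both lists in the same order, so they can be compared line by line. Then:
Code:
md5sum /tmp/md5sums.*
If the two lists have the same hash, the trees match. If not, this shows which files differ:
Code:
diff /tmp/md5sums.orig /tmp/md5sums.copy
Tweak to taste.

edit: forgot to add, if you don't want the processing overhead of running md5sum against everything, then replacing md5sum with some invocation of stat such as stat -c '%n %s' in the 2 find commands should work reasonably well, but it only compares file names and sizes, so it obviously won't be as thorough a check.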
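The manifest idea above can be wrapped into a small script. This is only a sketch, assuming GNU find, sort, md5sum and mktemp are available and that both trees are reachable from the machine running it; the function names make_manifest and compare_trees are made up for illustration and are not from any existing tool.

```shell
#!/bin/sh
# Sketch: compare two directory trees via sorted checksum manifests.
# make_manifest and compare_trees are hypothetical helper names.

# Print "hash  ./relative/path" for every regular file under $1, sorted by path.
make_manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + | LC_ALL=C sort -k 2 )
}

# Return 0 if both trees contain the same files with the same contents.
# diff prints any lines that differ, i.e. changed, missing or extra files.
compare_trees() {
    orig_list=$(mktemp) || return 2
    copy_list=$(mktemp) || return 2
    make_manifest "$1" > "$orig_list" &&
        make_manifest "$2" > "$copy_list" &&
        diff "$orig_list" "$copy_list"
    status=$?
    rm -f "$orig_list" "$copy_list"
    return $status
}
```

After sourcing the file, something like compare_trees /original /path/to/backup (the second path is a placeholder) prints nothing and returns 0 when the trees match. Every file on both sides gets read in full, so on ~1TB expect it to take a while.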
According to the man page, rsync verifies each file with a checksum after it has copied it, separate from the check it does to decide whether the file needs transferring.
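Since rsync is already doing the copying, it can also be asked to do the comparison. A rough sketch, assuming rsync is installed and both trees are reachable from where it runs; the function name list_differences is made up for illustration. -r recurses, -c compares full-file checksums instead of size and modification time, -n makes it a dry run so nothing is copied, and -i itemizes each file rsync would update.

```shell
#!/bin/sh
# Sketch: list files whose contents differ between source $1 and backup $2.
# list_differences is a hypothetical helper name. Nothing is copied (-n).
list_differences() {
    rsync -rcni "$1"/ "$2"/
}
```

No output should mean rsync found nothing to update. Files missing from the backup are listed too, but extra files that exist only on the backup are not reported by this invocation.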