LinuxQuestions.org


mcgirvanmedia 06-03-2008 08:53 PM

Verifying Rsync Backups of Large Volumes of Files
 
Hi all, I'm having an issue with backing up my file server and verifying that rsync is doing its job. I have just started using rsync to back up about 150 GB of files (mainly audio and visual data) to another server over the gigabit network automatically every night. I'm new to rsync, so I tried to verify that it was creating a proper backup by using du to compare sizes, but the sizes differ and I can't figure out what is causing this. Exactly the same number of files exists on each drive, although the backup is about 460 MB larger (after last night; earlier this week it was 1 GB larger).

I'm using rsync as follows:

rsync -az --delete -e ssh root@***.***.***.***:/source /destination >> $log

I can't find any issues with the command I'm using (I'm using the --delete option, so it can't be a failure of deletes to follow through from source to destination). It's pretty much impossible to compare the two directory listings between source and destination by hand (there are hundreds of thousands of files to compare). So why are the disk usage results different? The only possible explanation I can think of would be a difference between hard drives, but even then they're both using the same FS with the same block size (and they're the same capacity - although different manufacturers).

I was considering writing a script that compares the outputs of ls line by line. However, the order of files and directories output by ls -lR sometimes differs between the two drives (even though the same files are in each directory), which may create false positives.

Any ideas how I can compare the two and find out why my backups seem to be slightly larger?

irishbitte 06-03-2008 10:02 PM

Interesting one, I have come across it before. It generally happens when the --delete option is not invoked on every backup. At least that's my experience, so it could be worth SFA!
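
One way to check whether stale files on the destination account for the difference is an rsync dry run, which reports what would be transferred or deleted without touching anything. The sketch below is only an illustration: the function name rsync_dry_check is made up here, and the arguments should be the same source and destination used by the nightly job.

```shell
# Hypothetical helper: report what rsync WOULD change, without changing anything.
# $1 = source (local path or user@host:/path), $2 = destination directory.
rsync_dry_check() {
    # -a  archive mode, as in the nightly job
    # -n  dry run: nothing is written or deleted
    # -c  compare file contents by checksum instead of size and mtime
    # -i  itemize every difference, one line per file
    # --delete  also list destination files that no longer exist on the source
    rsync -anci --delete -e ssh "$1" "$2"
}
```

Lines starting with *deleting are files that exist only on the backup. The -c option reads every file on both sides, so on 150 GB it will take a while.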

bryanl 06-03-2008 11:30 PM

It may be a matter of file fragmentation or hard links.
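
To test the link theory: rsync -a does not include -H, so files that are hard-linked on the source arrive as separate full copies on the destination. Sparse files have a similar effect, because du counts allocated blocks and rsync only keeps holes with -S. A rough sketch for checking both follows; the function names are made up for this example, and the size check assumes GNU du.

```shell
# Hypothetical helper: list regular files that have more than one hard link.
# $1 = top of the tree to scan.
list_hardlinked() {
    find "$1" -type f -links +1 -print
}

# Hypothetical helper: print allocated and apparent size for a tree.
# Assumes GNU du, where -b means apparent size in bytes.
tree_sizes() {
    printf 'allocated (KiB): %s\n' "$(du -sk "$1" | cut -f1)"
    printf 'apparent (bytes): %s\n' "$(du -sb "$1" | cut -f1)"
}
```

If list_hardlinked prints anything on the source, adding -H to the rsync options is worth trying. If the apparent sizes match on both machines but the allocated sizes do not, sparse files or allocation differences are the more likely cause.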

I use a snapshot approach - see rsync snapshot backups using cp -al and a bit of renaming. I had to use the --modify-window option to handle a CIFS problem.

If you are thinking of writing scripts, I'd consider calculating MD5 sums. Remastersys does this with
Code:

find . -type f -print0 | xargs -0 md5sum > md5sum.txt
(Remastersys is a script that creates a bootable DVD backup of an Ubuntu system.) You can run this on both systems and then compare the resulting md5sum lists. Sort both lists by path first, since find does not guarantee the same order on both machines.
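
A minimal sketch of that comparison, assuming both trees can be read as local paths (otherwise run make_manifest on each server and copy one list across). The names make_manifest and compare_trees are invented for this example and are not part of Remastersys.

```shell
# Hypothetical helper: print a checksum manifest for one tree, sorted by path
# so that two manifests can be compared line by line.
make_manifest() {
    # Run in a subshell so the caller's working directory is unchanged.
    (
        cd "$1" || exit 1
        # -print0 / -0 keep file names with spaces intact; -r skips md5sum
        # when there are no files; sorting on the path column removes any
        # dependence on directory order.
        find . -type f -print0 | xargs -0 -r md5sum | LC_ALL=C sort -k 2
    )
}

# Hypothetical helper: succeed only if both trees have identical manifests.
# Prints the differing lines (diff format) when they do not.
compare_trees() {
    src_list=$(mktemp) && dst_list=$(mktemp) || return 2
    make_manifest "$1" > "$src_list"
    make_manifest "$2" > "$dst_list"
    diff "$src_list" "$dst_list"
    status=$?
    rm -f "$src_list" "$dst_list"
    return $status
}
```

Because the manifest uses relative paths, the two top-level directories can have different names. A matching manifest only shows that the contents agree; it says nothing about why du reports different sizes.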


All times are GMT -5.