LinuxQuestions.org


wjtaylor 03-15-2011 04:16 PM

how do i verify rsync
 
I've been playing with rsync. I copied some data from one drive to another and saw no errors, but when I inspect the two directories with du -s I get two different values:

root@T:/home/user/Desktop# du -s backups
759421964 backups
root@T:/home/user/Desktop# du -s /data/tmp.moved/
759568528 /data/tmp.moved/
root@T:/home/user/Desktop#

How can I see what's different? I've been playing with diff and find/ls, but it's not as nice/complete as I'd like.

I did the following as root:

cd sourcedir
ls -aR > ~user/original
cd backupdir
ls -aR > ~user/backup
cd ~user
diff -yW 200 original backup > diff.txt

grep \> diff.txt
grep \< diff.txt
grep \| diff.txt

All three came back empty.

Doing the above with ls -laR got messy...
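A possibly tidier alternative to diffing ls -laR output is to have find print only the fields that should match on both sides. The sketch below assumes GNU find (for -printf) and file names without embedded newlines; list_tree and compare_trees are hypothetical helper names, not standard tools. Directories are left out on purpose, because their reported sizes legitimately differ between filesystems.

```shell
#!/bin/sh
# List every non-directory entry as "type<TAB>size in bytes<TAB>relative path".
# Assumes GNU find; sorted in the C locale so both listings use the same order.
list_tree() {
    (cd "$1" && find . ! -type d -printf '%y\t%s\t%P\n' | LC_ALL=C sort)
}

# Diff the listings of two trees. Returns 0 if they match, 1 if they differ.
compare_trees() {
    a=$(mktemp) && b=$(mktemp) || return 2
    list_tree "$1" > "$a"
    list_tree "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return "$status"
}
```

With the paths from this thread, that would be something like: compare_trees /data/tmp.moved ~user/backups. No output means every file name and byte size matches; it says nothing about file contents.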

Note: the source is an XFS filesystem (on a RAID5 array, if that matters); the backup is an ext4 filesystem.

Could the difference in size be due to the difference in filesystem architecture?

The original rsync command was:
rsync -aHvPh --delete /data/tmp.moved/ ~user/backups

Hmm... I just realized I left the trailing slash off the end of the backups directory. I'm not sure that would cause a difference in disk usage, though; the diff of the ls -aR output showed no differences.
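On the trailing-slash question: as far as rsync's documented behaviour goes, a trailing slash matters on the source (copy the directory's contents rather than the directory itself), while on a destination directory it makes no difference. A throwaway experiment along these lines can confirm that on a given system; slash_demo is a hypothetical name and everything happens in a scratch directory.

```shell
#!/bin/sh
# Copy the same source three ways into a scratch directory and list the results.
slash_demo() {
    work=$(mktemp -d) || return 1
    mkdir "$work/src" && printf 'x' > "$work/src/file"
    rsync -a "$work/src/" "$work/dst_noslash"       # slash on source only
    rsync -a "$work/src/" "$work/dst_slash/"        # slash on both
    rsync -a "$work/src"  "$work/dst_nosrcslash/"   # no slash on source
    (cd "$work" && find dst_noslash dst_slash dst_nosrcslash | LC_ALL=C sort)
    rm -rf "$work"
}
```

The first two destinations should end up with file directly inside them, while the third should gain an extra src/ level.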

Any ideas?

Thanks,
WT

Omnicronos 03-15-2011 04:43 PM

Each file system will indeed allocate the data differently, most likely caused by how they handle journaling. Another potential cause is your source directories have expanded and retracted in size. They will still have space allocated in the directories for the deleted file names even if the directories are empty. Since rsync rebuilds the directories from scratch during the copying process, the new directories on the target no longer have that space allocated for missing files.
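One way to test this explanation is to compare a number that does not depend on block allocation or directory overhead: the summed byte length of all regular files. A minimal sketch, assuming GNU find; total_file_bytes is a hypothetical helper name.

```shell
#!/bin/sh
# Sum the byte lengths of all regular files under a directory.
# Unlike du -s, this ignores allocated blocks and the size of directories.
# %.0f is used so that totals above 2^31 print correctly in older awk builds.
total_file_bytes() {
    find "$1" -type f -printf '%s\n' | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}
```

Running total_file_bytes /data/tmp.moved and total_file_bytes ~user/backups should print the same number if only the on-disk allocation differs. Hard-linked files are counted once per link on both sides, so the totals stay comparable.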

wjtaylor 03-15-2011 05:12 PM

Quote:

Originally Posted by Omnicronos (Post 4291919)
Each file system will indeed allocate the data differently, most likely caused by how they handle journaling. Another potential cause is your source directories have expanded and retracted in size. They will still have space allocated in the directories for the deleted file names even if the directories are empty. Since rsync rebuilds the directories from scratch during the copying process, the new directories on the target no longer have that space allocated for missing files.

So, would you say that the percentage difference in the disk usage is reasonable and not a concern given your above comment? Don't worry you're not liable. :)

WT

Weird0ne 03-15-2011 05:23 PM

I agree with Omnicronos that it's most likely just the difference in filesystems. But if you wanted to be sure you could make a checksum before the transfer, and then verify against it afterwards.
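A rough sketch of that checksum approach, assuming GNU coreutils and findutils (sha256sum, sort -z, xargs -0 -r); make_manifest and verify_manifest are hypothetical helper names. The manifest stores paths relative to the tree root, so the same file can be checked against the copy.

```shell
#!/bin/sh
# Write a SHA-256 line for every regular file under a directory to stdout.
make_manifest() {
    (cd "$1" && find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum)
}

# Check a directory against a manifest read from stdin.
# --quiet prints only the files that fail; exit status is non-zero on mismatch.
verify_manifest() {
    (cd "$1" && sha256sum --quiet -c -)
}
```

For this thread that would look like: make_manifest /data/tmp.moved > ~user/manifest.sha256, followed by verify_manifest ~user/backups < ~user/manifest.sha256. Expect it to take a while on a tree this size, since every file is read in full on both sides.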

Omnicronos 03-15-2011 06:33 PM

Quote:

Originally Posted by wjtaylor (Post 4291932)
So, would you say that the percentage difference in the disk usage is reasonable and not a concern given your above comment? Don't worry you're not liable. :)

WT

No worries. I doubt you are experiencing any data loss. But, as Weird0ne wisely suggested, it never hurts to verify your data with a checksum.

Quote:

rsync --checksum: Normally rsync compares the timestamp and the size of a file to determine if it has changed since the last backup. If you use --checksum, rsync will ignore the timestamps and checksum any files that are the same size to determine if they are different. Obviously this adds a significant slowdown to the backup process. You wouldn't normally use this option; however, it is good to have if you believe your backup data has become corrupted in a way that doesn't affect the information you see in an ls -l output.
http://www.sanitarium.net/golug/rsync_backups_2010.html
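For a verification pass that changes nothing, --checksum can be combined with rsync's dry-run and itemize options. A minimal sketch, reusing the flags from the original command; rsync_verify is a hypothetical wrapper name.

```shell
#!/bin/sh
# Dry-run (-n) checksum (-c) comparison of source and destination.
# --itemize-changes prints one line per entry rsync would update or delete;
# nothing is actually copied or removed.
rsync_verify() {
    rsync -aHnc --delete --itemize-changes "$1" "$2"
}
```

Called as rsync_verify /data/tmp.moved/ ~user/backups, empty output would suggest the backup matches the source in both content and preserved metadata. Any lines that do appear name the files worth a closer look.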

