rsync uses wrong directory, has wrong size but right file count?
Hi,
I am setting up a poor man's RAID 1 between 2 drive subsystems on my main workstation (no network involved). My command was:
rsync --archive --verbose /r5/pic /tera/rsync/pic > pic.rsync.log
I expected it to take my /r5/pic directory as master and the other one as the slave and make whatever changes were necessary to the slave to make it EXACTLY mirror the master. I used archive mode to keep my permissions, times, etc.
What it did was create a /tera/rsync/pic/pic directory and then dump LESS THAN EVERYTHING there. Dolphin shows me a (slightly) different byte size, with fewer bytes on the slave, but it got the file count right. I was not monkeying with either directory at the time.
Dolphin shows 111,534,884,400 bytes on /r5
Dolphin shows 111,533,884,976 bytes on /tera
godzilla2:/home/brianp/ # /usr/bin/du -s /r5/pic
109554180 /r5/pic
godzilla2:/home/brianp/ # /usr/bin/du -s /tera/rsync/pic/pic
109553184 /tera/rsync/pic/pic
The file count is identical, 95,053 files and 735 sub-folders. The question is, where is the lost megabyte and is it in 1 truncated file or are there short counts in many corrupt slave files? A single byte missing from a jpg can trash the entire file. I need an exact mirroring.
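One cheap way to see whether the missing bytes sit in one file or many would be to list every file with its byte length on both sides and diff the two lists. This is only a sketch: it assumes GNU find (for -printf), and the function names size_manifest and size_diff are made up for illustration.

```shell
#!/bin/sh
# size_manifest DIR: print "relative/path<TAB>size-in-bytes" for every
# regular file under DIR, sorted by path. Assumes GNU find (-printf).
size_manifest() {
    find "$1" -type f -printf '%P\t%s\n' | LC_ALL=C sort
}

# size_diff MASTER SLAVE: print only the manifest lines that differ.
# Exit status follows diff: 0 = identical, 1 = differences found.
size_diff() (
    a=$(mktemp) && b=$(mktemp) || exit 2
    size_manifest "$1" > "$a"
    size_manifest "$2" > "$b"
    status=0
    diff "$a" "$b" || status=$?
    rm -f "$a" "$b"
    exit "$status"
)

# Hypothetical call for the directories in this post:
#   size_diff /r5/pic /tera/rsync/pic/pic
```

Lines starting with "<" come from the master and lines starting with ">" from the slave, so a truncated file would show up as a pair with the same path and two different sizes. Files with equal lengths but different content would not be caught this way.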
Other than calculating an MD5 on every file in the 111 GB set, times two, and comparing them, is there another way to actually mirror my slave to my master? I want to be able to do it in a cron job, so doing a full, gargantuan copy every time is more than necessary (I hope).
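For reference, the brute-force check can be sketched without MD5 at all, by comparing each master file byte for byte against its slave counterpart with cmp. It still reads every byte on both drives, so it is no cheaper than checksumming; byte_compare is a made-up name, and the paths in the comment are just the ones from this post. rsync also has a --checksum option that compares file contents instead of size and modification time, and together with --dry-run and --itemize-changes it reports what it would change without copying anything.

```shell
#!/bin/sh
# byte_compare MASTER SLAVE: print the relative path of every regular file
# in MASTER whose copy in SLAVE is missing or differs in content.
byte_compare() (
    slave=$(cd "$2" && pwd) || exit 1   # absolute path, since we cd below
    cd "$1" || exit 1
    find . -type f -exec sh -c '
        slave=$1; shift
        for f in "$@"; do
            # cmp -s is silent and returns non-zero on any difference
            cmp -s "$f" "$slave/$f" || printf "%s\n" "${f#./}"
        done' sh "$slave" {} +
)

# Hypothetical call for the directories in this post:
#   byte_compare /r5/pic /tera/rsync/pic/pic
```

An empty result would mean every master file has an identical twin on the slave. Extra files that exist only on the slave are not reported.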
I "presume" that the file size reported by both du and Dolphin is the actual file size, not counting slack space caused by different "cluster" (?) sizes in the two formats. /r5 is ext3 and /tera is ext2. /r5 is also a hardware RAID 5 on a 3Ware 9650se controller. /tera is a Hitachi deathstar 1 TB drive.
Is it possible that I could have "sparse" files with holes in the middle? That would be hard to believe with .jpg image files.
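A quick way to test the sparse-file idea, and the slack-space presumption as well, is to ask du for both numbers. By default GNU du reports allocated disk blocks, and with --apparent-size it reports the file length instead. The sketch below assumes GNU du; sparse_check is a made-up name.

```shell
#!/bin/sh
# sparse_check FILE: print the apparent size (file length) and the
# allocated size (blocks actually used on disk), both in KiB.
# A file with holes shows allocated noticeably smaller than apparent.
sparse_check() (
    apparent=$(du --apparent-size -k "$1" | cut -f1)
    allocated=$(du -k "$1" | cut -f1)
    printf '%s\tapparent=%sK\tallocated=%sK\n' "$1" "$apparent" "$allocated"
)

# Whole-tree totals by length rather than by blocks, for the paths here:
#   du -s --apparent-size /r5/pic /tera/rsync/pic/pic
```

If the --apparent-size totals of the two trees agree, the difference seen with plain du -s would come from block allocation on the two filesystems rather than from missing data.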
Finally, for the MASTER, you give it the actual directory to clone. For the SLAVE, you give it the PARENT of the directory to be synchronized. Is that a hare-brained (microsoftian) interface or am I reading it wrong?
Thank you,
BrianP