In actual fact, it would seem that the problem was sparse files after all.
I had quite a bit of trouble confirming this, though. I ended up hacking together two scripts, and without the second one I don't think I could have solved the issue short of erasing the entire destination disk and starting anew.
I first tried a diff, to see what differed between the source and the destination:
Code:
diff -rq /mnt/tmp/ /mnt/external/
Let it be said that a diff on more than a terabyte of data takes a very long time; I stopped it after about five hours.
Next, I wrote a script to check whether the backup files differed in size from the source files (and to see which files were missing from the backup):
Code:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Report files missing from the backup and files whose apparent size differs.
import os
from os.path import join, getsize, exists

path1 = "/mnt/tmp/"       # source
path2 = "/mnt/external/"  # backup

for root, dirs, files in os.walk(path1):
    for file in files:
        # path1 ends with "/", so the slice below is a relative path
        mirror_path = join(path2, root[len(path1):], file)
        file_path = join(root, file)
        if not exists(mirror_path):
            print(file_path + " exists.")
            print(mirror_path + " absent.")
        elif getsize(file_path) != getsize(mirror_path):
            print(file_path + " size : " + str(getsize(file_path)))
            print(mirror_path + " size : " + str(getsize(mirror_path)))
It turned out that every file present in both places had the same size, and that just a few were missing from the backup because there was no space left for them. I then swapped path1 and path2 to check that there were no extra files in the backup; there weren't.
So, I made a new script to compare the number of filesystem blocks each file uses on the source and on the destination:
Code:
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
# Report files that use a different number of filesystem blocks in the backup.
import os
from os.path import join, exists

path1 = "/mnt/tmp/"       # source
path2 = "/mnt/external/"  # backup

for root, dirs, files in os.walk(path1):
    for file in files:
        # path1 ends with "/", so the slice below is a relative path
        mirror_path = join(path2, root[len(path1):], file)
        file_path = join(root, file)
        if exists(mirror_path):
            # st_blocks is the space actually allocated, not the apparent size
            if os.stat(file_path).st_blocks != os.stat(mirror_path).st_blocks:
                print(file_path + " blocks : " + str(os.stat(file_path).st_blocks))
                print(mirror_path + " blocks : " + str(os.stat(mirror_path).st_blocks))
It turns out that some files used up a lot more blocks in the backup! It seems some files were sparse in the source, but not in the destination :-/
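For anyone wondering why the first script saw nothing wrong: st_size is the apparent length of a file, while st_blocks counts what is really allocated on disk. Here is a rough sketch of that comparison for a single file. The helper names are my own, and I am assuming st_blocks is counted in 512-byte units, which is what Linux does. Note that a compressing filesystem could also make a file "look sparse" by this test, so treat it as a hint only.

```python
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import os

# Assumption: st_blocks counts 512-byte units, as it does on Linux.
BLOCK_UNIT = 512


def allocated_bytes(st_blocks):
    """Bytes actually allocated on disk for a given st_blocks value."""
    return st_blocks * BLOCK_UNIT


def looks_sparse(st_size, st_blocks):
    """True if the file occupies less disk space than its apparent size."""
    return allocated_bytes(st_blocks) < st_size


def report(path):
    """Print apparent size versus allocated size for one file."""
    st = os.lstat(path)
    print("%s apparent=%d allocated=%d sparse=%s" % (
        path, st.st_size, allocated_bytes(st.st_blocks),
        looks_sparse(st.st_size, st.st_blocks)))
```

Calling report() on the same file in the source and in the backup should show the same apparent size but a very different allocated size if the copy lost its holes.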
So I modified the last script to delete the offending files from the backup and did another rsync, and presto, the source and the backup are now just about the same size!
Remarks:
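I did not keep the modified script, so what follows is only a sketch of the idea and not the exact code I ran. The function names, the dry_run switch and the injectable blocks argument are all made up for illustration. Unlike the script above, it only flags copies that use more blocks than their source, and it skips symlinks entirely.

```python
#!/usr/bin/env python
# -*- coding: UTF-8 -*-
import os
from os.path import join, relpath, normpath, islink, lexists


def default_blocks(path):
    # lstat, so that a symlink is never followed
    return os.lstat(path).st_blocks


def find_bloated_copies(src, dst, blocks=default_blocks):
    """Return backup paths that use more blocks than their source file."""
    bloated = []
    for root, dirs, files in os.walk(src):
        for name in files:
            file_path = join(root, name)
            mirror_path = normpath(join(dst, relpath(root, src), name))
            # Skip symlinks on either side rather than guess what to do.
            if islink(file_path) or islink(mirror_path):
                continue
            if not lexists(mirror_path):
                continue
            if blocks(mirror_path) > blocks(file_path):
                bloated.append(mirror_path)
    return bloated


def delete_bloated_copies(src, dst, dry_run=True, blocks=default_blocks):
    """Delete bloated backup copies; only prints unless dry_run is False."""
    for path in find_bloated_copies(src, dst, blocks):
        print(("would delete " if dry_run else "deleting ") + path)
        if not dry_run:
            os.remove(path)
```

Run delete_bloated_copies("/mnt/tmp/", "/mnt/external/") first and read the list; only pass dry_run=False once it looks right, then run the backup again so the deleted files get copied back.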
1/ If you use the above code, beware that it seems to have a few issues with symlinks.
2/ I really feel that all this was overly complex. Shouldn't rdiff default to handling sparse files, or shouldn't adding the "--sparse" switch replace "regular" files in the destination with sparse files? (This may not be trivial to implement, mind you.) At least mention sparse files and the woes they can cause in the rdiff docs...
3/ The script executes in under five minutes, a lot quicker than a full diff...
4/ I tend to ramble... maybe nobody is interested in my problems, but maybe googling this thread could help someone one day.