Is there a way I can create a tar file with high data integrity? I want to back up some important stuff and burn it to a CD, but I don't feel 100% sure the data will be safe. I would like to sacrifice some space and add extra parity to the tar file. Is there a way to do that?
While this is true, it won't help him if the CD with the archive gets scratched while in storage ...
Cheers,
Tink
All righty then. The original question appeared to me to be asking about verifying a backup that you are creating, but let's go with verifying a backup from storage.
I would say that you would have to prepare a checksum for your backup at the time that you make it. Then you can run the same checksum operation when you take the backup medium out of storage. It would go something like this.
Prepare your backup for storage:
1) Make your backup.
2) Ensure that you are happy with it.
3) Obtain an MD5 checksum on the storage medium.
4) Write that checksum by hand on the storage medium.
Verify medium from storage:
1) Obtain an MD5 checksum on the data on the medium.
2) See if that MD5 checksum matches the one that was made at the time that the backup was done.
Here is how I would implement it.
Create the backup. Let's use the following conditions. We are going to create a tar archive file on a disk partition mounted on /mnt/backup. The tar file will be named test.tar.
Code:
root> tar -c --verify -vf /mnt/backup/test.tar .
Okay so I just executed that and it worked. The tar file was created and it was verified. Now let's create an MD5 checksum for this known good file.
Now write that MD5 checksum onto the backup medium.
When you retrieve the backup medium from storage you just run an MD5 checksum on the backup image and compare it to the one that is written on the medium.
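The two checksum steps above can be sketched roughly as follows. This is only an illustration, assuming GNU tar and GNU coreutils md5sum; the scratch directories are hypothetical stand-ins for /mnt/backup and for the data being backed up, and test.tar.md5 is a made-up file name that plays the role of the checksum written on the medium.

```shell
# Hypothetical scratch directories standing in for /mnt/backup and the source data.
backup_dir=$(mktemp -d)
src_dir=$(mktemp -d)
echo "important data" > "$src_dir/file.txt"

# Create the archive (the post above uses: tar -c --verify -vf /mnt/backup/test.tar .)
tar -C "$src_dir" -cf "$backup_dir/test.tar" .

# At backup time: record the checksum of the known good tar file.
( cd "$backup_dir" && md5sum test.tar > test.tar.md5 )

# Out of storage: recompute the checksum and compare it with the recorded one.
( cd "$backup_dir" && md5sum -c test.tar.md5 )
verify_status=$?   # 0 means the archive still matches its recorded checksum
```

A non-zero verify_status would mean the archive no longer matches what was recorded at backup time.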
You can play this game with any medium, whether it be tape, CD-RW, DVD-RW, EEPROM, or whatever. You could make the tar file on a hard disk and then use something like K3B to burn that file onto a CD or DVD, which could then be mounted like a disk. You could then do an MD5 sum on the CD or DVD image to see if it matches the tar file's MD5 sum on the hard disk. That would validate the burn onto CD or DVD. Actually K3B will do this for you if you ask it nicely.
Funny thing. I'm sure that I used to use tar to stream directly onto a DVD-RW, just like a tape drive. I just tried it and it wouldn't work. The only difference that I can think of is that I don't use IDE=SCSI in the kernel any longer. That's why my example above uses a tar file on a hard disk partition that can be mounted to a normal mount point. The procedure will certainly work if you send your tar stream directly to a tape drive.
Anyway, that's my idea for verifying backup media out of storage, which I didn't think that the original question even asked about.
Last edited by stress_junkie; 09-09-2006 at 07:35 PM.
getting the MD5SUM of the tarball is nice, but having a correct MD5SUM of a corrupt tarball is a very real possibility... for this reason, i would recommend taking it a step further by performing an MD5SUM of every single file which is going to go into the tarball (in addition to getting an MD5 of the tarball afterwards)... this way when you untar the tarball, you can check the actual files you care about for integrity...
let's say we are in the directory which contains everything we wanna back-up:
Code:
# skip the checksum file itself so it doesn't end up in its own list
find . -type f ! -name CHECKSUMS.md5 -exec md5sum {} \; > CHECKSUMS.md5
now our backup directory will contain a checksum file which will get tarballed along with everything else... to verify the integrity of the files after untarring, just cd to the base dir and do a:
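A minimal sketch of that per-file check, assuming GNU md5sum and its -c (check) mode. The sample directory and files are made up so the example can run on its own; in practice the base dir would be wherever the tarball was unpacked.

```shell
# Hypothetical sample tree standing in for the untarred backup.
work=$(mktemp -d)
mkdir -p "$work/data/sub"
echo "alpha" > "$work/data/a.txt"
echo "beta"  > "$work/data/sub/b.txt"
cd "$work/data"

# Build the checksum list, leaving the list file itself out of it.
find . -type f ! -name CHECKSUMS.md5 -exec md5sum {} + > CHECKSUMS.md5

# From the base dir: check every listed file against its recorded checksum.
md5sum -c CHECKSUMS.md5
check_status=$?      # 0 when every file matches

# Simulate damage to one file; md5sum -c then reports it as FAILED.
echo "corrupted" > a.txt
md5sum -c CHECKSUMS.md5
corrupt_status=$?    # non-zero when any file fails
```

The second run shows which file failed while the others still report OK, which is the point of checksumming each file rather than only the tarball.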
Quote:
getting the MD5SUM of the tarball is nice, but having a correct MD5SUM of a corrupt tarball is a very real possibility
We know that the tar file is good at the time that it is created because we used the --verify option in tar.
I have to admit that I like your idea of getting a checksum on every file to be archived. My procedure only lets us know if the tar file is bad from damage to the medium during storage. Your procedure lets us know which files are still good even if the tar file suffered some degradation.
Last edited by stress_junkie; 09-09-2006 at 07:58 PM.
Quote:
We know that the tar file is good because we used the --verify option in tar.
hehe, sorry, i don't know how i missed that...
Quote:
I have to admit that I like your idea of getting a checksum on every file to be archived. My procedure only lets us know if the tar file is bad. Your procedure lets us know which files are still good even if the tar file suffered some degradation.
yeah, i think mixing both procedures would be great...
Quote:
While this is true it won't help him if the CD with the archive gets scratched while in storage ...
Maybe check out dvdisaster: "dvdisaster provides a margin of safety against data loss on CD and DVD media caused by aging or scratches. (..) dvdisaster is available for recent versions of the FreeBSD, Linux and Windows operating systems."
Verifying good backups and good CD burns is necessary, but so is a test restore IMHO. I think tar/Linux is quite secure in making good backups, but I can't say the same for the Windows world. I know - you're not talking about the Windows world. However, I have personally experienced, and known others who have too, a "good, verified" backup that couldn't be restored at a later date (in the Windows world). So that lesson learned has been carried over even after I switched to Linux. Important stuff always gets a test restore, and that restore is done on a different computer. Using different hardware. It doesn't matter if the CD burner that burned the backup can read its own CD later, if that particular CD drive goes bad on you. You need to make sure that other CD drives can read your burned disk reliably and error-free as well.
And as Tink said, don't trust the media. CDs can go bad. Burn two copies. If you really want to be paranoid, use two different brands of (high quality) media and two different CD burners if you have them. Then next year, burn two more (but still hang on to the old ones). And on and on. Put one copy in your bank safe-deposit box, and mail the other one to a relative or friend in a different geographic location.
don't forget to encrypt the files on the CD that you mail your relative or friend in a different geographic location (you'll need a separate set of test restores)...
This is how I do encrypted backups. Actually I put tar backup files onto encrypted partitions on external USB drives, but I'm going to describe how to create an encrypted container file to hold a backup. Then the container file can be copied to some storage medium.
1) Create a container file the size of the backup medium. You only have to do this once.
2) Mount the container file to a loop device using encryption.
Code:
losetup -e blowfish /dev/loop0 /var/backup/container.file
password: <enter your encryption key-password>
3) Create your file system through the loop device into the container file.
Code:
mkfs -t xfs /dev/loop0
4) Mount the loop device to some mount point.
Code:
mount -o sync /dev/loop0 /mnt/backup
5) Create your tar file in /mnt/backup. <See posts above.>
6) Unmount your container file.
Code:
sync
umount /mnt/backup
losetup -d /dev/loop0
7) Now use whatever method pleases you to copy the /var/backup/container.file to some medium.
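Step 1 above does not show a command. One common way to make a fixed-size container file is dd; the sketch below assumes GNU dd, a hypothetical scratch path instead of /var/backup/container.file, and a small 8 MiB size so it runs quickly (for a 700 MB CD the count would be about 700).

```shell
# Hypothetical scratch location; the post above uses /var/backup/container.file.
container="$(mktemp -d)/container.file"
size_mib=8   # assumption: kept small for the example

# Fill the container with zeros, 1 MiB at a time.
dd if=/dev/zero of="$container" bs=1M count="$size_mib" 2>/dev/null

# Confirm the resulting size in bytes.
container_bytes=$(wc -c < "$container")
echo "$container_bytes"
```

Filling the file from /dev/urandom instead of /dev/zero is slower but avoids revealing which parts of the encrypted container are unused.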
I've heard that dm-crypt is better than crypto-loop. I just haven't started to use dm-crypt yet. I'm not sure that dm-crypt would apply to a container file anyway but I've heard that I should be using it for my USB disks.
Edited to change reiserfs to xfs in step 3. I've had a lot of trouble with reiserfs lately.
Last edited by stress_junkie; 10-03-2006 at 09:46 AM.
I did read the man page. The verify option, like checksums, does just that: it verifies that the data is correct. But that is not what I want. I expect data on a CD to get corrupted over time. It sucks and I have had it happen many, many times, but unfortunately verify will just tell me what I probably already know, that it's bad.
What I am asking for is something that can correct the error, just like RAID. Knowing that a drive is broken is nice, but it's not much help if you still can't get the data. RAID 5 adds parity and thus goes beyond "this drive is broke" and fixes it so I get my files, instead of saying "it's broke so you lose" like RAID 0 does.
dvdisaster looks like what i want but i was kinda hoping for something that is installed on just about every distro so i don't have to go searching for the program to read it later
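To make the RAID 5 comparison concrete, here is a toy sketch of XOR parity in shell arithmetic. The three byte values are made up, and this is only an illustration of the idea, not how dvdisaster or any real tool stores its error correction data.

```shell
# Three made-up data bytes standing in for three data blocks.
d0=72
d1=105
d2=50

# The parity block is the XOR of all data blocks.
parity=$(( d0 ^ d1 ^ d2 ))

# Pretend d1 was lost: XOR of the surviving blocks and the parity gives it back.
recovered=$(( d0 ^ d2 ^ parity ))

echo "parity=$parity recovered=$recovered"
```

A checksum alone could only say that d1 is wrong; the extra parity block is what allows it to be rebuilt, at the cost of the space the parity takes.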
Quote:
Create the backup. Let's use the following conditions. We are going to create a tar archive file on a disk partition mounted on /mnt/backup. The tar file will be named test.tar.
Code:
root> tar -c --verify -vf /mnt/backup/test.tar .
Okay so I just executed that and it worked. The tar file was created and it was verified. Now let's create an MD5 checksum for this known good file...
Thanks for this excellent example. When I use --verify, it can't find the files because the leading / is removed. How can I solve this?
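One possible way around this, sketched under the assumption of GNU tar: run tar from the directory that the stored names are relative to and give the file names without the leading slash, so that the names recorded in the archive match what --verify looks up afterwards. The scratch directories below are hypothetical stand-ins for / and /mnt/backup.

```shell
# Hypothetical stand-ins: root_dir plays the role of /, out_dir of /mnt/backup.
root_dir=$(mktemp -d)
out_dir=$(mktemp -d)
mkdir -p "$root_dir/etc"
echo "setting=1" > "$root_dir/etc/example.conf"

# Work from the top directory and name the files relative to it,
# so the archived names (etc/...) exist relative to where tar verifies them.
cd "$root_dir"
tar -c --verify -f "$out_dir/test.tar" etc
tar_status=$?   # 0 when the archive was written and verified
```

For a real system backup that would mean something like changing to / first and archiving etc rather than /etc.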