Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-to's this is the place!
This morning I decided to upgrade my workstation to Ubuntu 12.04.3 LTS. I made a backup of my home directory like so:
Code:
tar -xvzf /media/DATA/bak.tgz /home/sneakyimp
NOTE: /media/DATA is a separate hard drive for backups and other data. It is formatted exFAT because my workstation does double duty as a Windows machine for audio recording.
I successfully upgraded, but now I cannot seem to extract the tgz file. I created a subdirectory on the new system (/home/sneakyimp/bak) and tried to extract the file there, but for some reason the extraction failed. I don't recall the exact error.
So I copied the file from /media/DATA to /home/sneakyimp/bak and then tried to extract it there:
Code:
cp /media/DATA/bak.tgz /home/jaith/bak
cd /home/jaith/bak
tar -xvzf bak.tgz
It extracts files for a good long time but ultimately halts with an error:
Code:
gzip: stdin: invalid compressed data--format violated
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
This appears to happen when tar tries to extract a large tgz file contained within the archive.
I really hope I can get this file to extract properly. Can anyone recommend a way to get my data back?
This command does not create a tar file; it extracts files from it:
Code:
tar -xvzf /media/DATA/bak.tgz /home/sneakyimp
What you wanted to do was
Code:
tar -cvzf /media/DATA/bak.tgz /home/sneakyimp
You can check the table of contents of the tar file with
Code:
tar tvf bak.tgz
If it can read the table, you may be able to exclude the broken file and extract the rest. You could also try on a Windows system, which will probably have better exFAT support.
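For instance, using the archive path from the original post and a hypothetical member name for the damaged entry, the listing and a partial restore might look like this:

```shell
# List the table of contents without extracting anything.
# Redirecting to a file makes a 62 GB archive's listing easy to search.
tar -tvzf /media/DATA/bak.tgz > contents.txt

# Extract everything except the (hypothetical) broken member.
# --exclude matches the member path as stored in the archive,
# i.e. without the leading slash.
tar -xvzf /media/DATA/bak.tgz --exclude='home/sneakyimp/broken-member.tgz'
```

Note that `home/sneakyimp/broken-member.tgz` is only a placeholder; the listing step is what tells you the real path of the entry where extraction fails.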
I mistyped my tar creation command. It did in fact create a tgz file of 62GB:
Code:
tar -cvzf /media/DATA/bak.tgz /home/sneakyimp
I've been trying to inspect the contents of the file using Ubuntu's Archive Manager (a GUI front end for various compressed file formats), and the file tends to crash the program. I'm not sure whether this is because of the file's size or possible corruption within it.
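As an alternative to the GUI, the table of contents can be dumped to a text file from the command line, which avoids Archive Manager entirely (assuming the same archive path as before):

```shell
# List the archive's contents without extracting; redirect to a file
# so the potentially huge listing is searchable afterwards.
tar -tvzf /media/DATA/bak.tgz > /tmp/bak-contents.txt

# See how many entries were read before any error occurred.
wc -l /tmp/bak-contents.txt
```

If corruption is present, tar will list entries up to the bad spot and then report an error, which narrows down where the damage is.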
I have just tried a different approach -- extracting the tgz archive on the exFat disk instead:
Code:
cd /media/DATA
tar -xvzf bak.tgz
This appears to be making more progress than my prior attempts for some reason. I suspect it has something to do with how tar deals with file systems. If I'm compressing the archive to an exFat file system, I should be prepared to extract it to an exFat file system -- or something like that.
Quote:
I have just tried a different approach -- extracting the tgz archive on the exFat disk instead. This appears to be making more progress than my prior attempts. If I'm compressing the archive to an exFat file system, I should be prepared to extract it to an exFat file system -- or something like that.
Nope.
The restore depends on the file being valid. If the tar file is on an exFat filesystem, then it is up to the exFat filesystem to ensure that data isn't damaged.
I don't trust the USB interface to provide sufficient, consistent power for long, intense operations. I prefer mains power adaptors, or, where two USB cables are used, connecting the power one to a second (quiet) machine.
Of course there could be all sorts of mismatches between writing/buffering/flushing ... but you'd hope not. Lots of people use USB all the time ... :shrug:
I am not using a USB drive. The secondary drive is an internally installed SATA drive.
Also, executing the tar command such that the archive is extracted to the secondary (exFat) hard drive works. I seriously believe there's either something about tar trying to work between filesystems or there's something up with the file system when it compresses between file systems.
Quote:
I am not using a USB drive. The secondary drive is an internally installed SATA drive. I seriously believe there's either something about tar trying to work between filesystems or there's something up with the file system when it compresses between file systems.
Nope.
If anything, using the same filesystem might avoid some problems due to buffering overhead that slows things down.
Tar works fine between filesystems. Used it for years that way. As long as the data isn't corrupted somewhere.
Quote:
If anything, using the same filesystem might avoid some problems due to buffering overhead that slows things down. Tar works fine between filesystems. Used it for years that way.
While your conviction is impressive, it's somewhat less convincing than the fact that I attempted to extract my archive from the exFat drive to the ext4 drive repeatedly and it failed. I tried extracting the archive from the exFat drive to itself and it worked the first time. My files are there in all their glory.
Quote:
I am not using a USB drive. The secondary drive is an internally installed SATA drive.
Ok, good.
Quote:
Also, executing the tar command such that the archive is extracted to the secondary (exFat) hard drive works.
I didn't get that it completed successfully from your post above.
The filesystem shouldn't matter - but I only use Linux native filesystems, and NTFS when I must (exFAT is also proprietary M$oft). I recently added the exFAT driver to my Fedora system, but only for reading USB keys, so file size is generally below 16 GB.
Have you thought about trying NTFS? The support is much more mature on Linux.
If your drive had plenty of space, then why did you make a tarball anyway? I only make tarballs when I need to transfer the files elsewhere (e.g. email/FTP), or when I absolutely need the compression. You clearly didn't need the compression, and you weren't transferring the file anywhere, so there was no need. I would be quite wary of a single 62 GB file regardless of the filesystem, but ESPECIALLY on FAT.
Quote:
I attempted to extract my archive from the exFat drive to the ext4 drive repeatedly and it failed. I tried extracting the archive from the exFat drive to itself and it worked the first time.
And I have been using it for about 5 years doing the same between filesystems, and using tar in general for about 20 years. Both backup and restore of entire systems (several hundred GBs each)
It isn't tar that has the problem.
Tar itself hasn't changed significantly in about 15 years (the last change I can think of was saving/restoring SELinux labels). The exact same code is used no matter what your filesystems are, or even whether it comes from and goes back to the same filesystem, or different filesystems. It isn't like dump which bypasses the filesystem to read directly from the disk. It always uses the kernel filesystems to open files/create files/read/write data...
For tar to fail requires the kernel to fail (buggy filesystems, buggy drivers, buggy memory management), the device to fail (blocks do go bad, controllers sometimes fail), or memory to fail (and memory can fail under load).
Your described problem puts double load on a single filesystem and device when it works - which slows things down. That would make me lean toward a possible memory problem or disk, but nothing definite. Ext4 has had reliable (not perfect) service for several years without issues. I don't use the Fat/exfat/NTFS filesystems as they have too many issues with fragmentation, and don't support security properly.
Indeed. In this case it isn't even tar but the pipelined gzip process that is reading the file and encountering the error. That gzip process has no notion of where tar is writing its output, so this suggests a memory, driver, or device problem.
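One way to test that theory is to check the gzip layer by itself: `gzip -t` reads and decompresses the entire stream without writing any output, so no destination filesystem is involved at all.

```shell
# Test only the compressed stream; nothing is extracted or written.
# A clean exit means the gzip data is intact end to end; an
# "invalid compressed data" error here would implicate the file
# (or the hardware it sits on), not tar or the target filesystem.
gzip -t /media/DATA/bak.tgz && echo "gzip stream OK"
```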
What happens if tar isn't writing anywhere?
Code:
tar -tzf /media/DATA/bak.tgz nadanada
Does that run to completion with just a complaint that "nadanada" was not found in the archive? Can you successfully extract just a portion of your home directory, perhaps one or two subdirectories, to the ext4 filesystem?
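A partial extraction might look like this. Since the archive was created from `/home/sneakyimp`, GNU tar stores members without the leading slash, so the member path is given as `home/sneakyimp/...` (here `Documents` is just a hypothetical subdirectory name):

```shell
# Extract a single subdirectory to the ext4 filesystem as a smaller test.
cd /home/sneakyimp/bak
tar -xvzf /media/DATA/bak.tgz home/sneakyimp/Documents
```

If a small subset extracts cleanly to ext4, that further suggests the failure is tied to reading the large archive, not to the destination filesystem.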
He said it was formatted exFAT, for which the file size limit is 16 exabytes (16 × 1024^3 GB), actually well beyond the maximum recommended volume size of 512 terabytes.