LinuxQuestions.org


shievelet 05-28-2006 02:43 PM

untaring in place?
 
i wasn't sure if this should go in general or in here.

to make a long story short, i have a HUGE tar file that takes up 2/3rds of a separate harddrive. the problem is, i need to untar it, but that hd is the only hd i have that can store that much data.

so my question is, is there a way to "untar in place?" as in, as the files are being untarred, incrementally remove the tar archive.

many thanks.

osor 05-28-2006 03:13 PM

If you are using GNU tar (most likely), you might try the --delete option.

shievelet 05-28-2006 03:46 PM

Code:

tar: You may not specify more than one `-Acdtrux' option
apparently i can't combine x(tract) with d(elete)

:cry:

i hope i don't have to go buy another harddrive just to untar this

osor 05-28-2006 04:39 PM

Quote:

Originally Posted by shievelet
Code:

tar: You may not specify more than one `-Acdtrux' option
apparently i can't combine x(tract) with d(elete)

:cry:

i hope i don't have to go buy another harddrive just to untar this

I see. First off, -d stands for --diff (not --delete). Moreover, even if you can't do both at once, you could untar in place one file at a time (I'm not sure, but I think that's what the program probably does internally anyway). Unless you have a very huge file inside the archive, you would be fine with

Code:

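# Extract members one at a time, last archived first, and delete each
# one from the archive once it has been extracted successfully.
for file in $(tar tf myfile.tar | tac); do
        tar xf myfile.tar "$file" && tar f myfile.tar --delete "$file" || exit 1
done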

The `tac' ensures that they go out in the reverse order they went in (so a directory is not deleted before its contents). If you are paranoid about losing your data, I would recommend looking at `tar tvf' and seeing which files are largest.

Edit: small change in loop body (added exit 1)

NOTE: I am not responsible for any data loss resulting from this post. If you are unsure about anything, post a reply and we'll try to help.
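
A minimal sketch of the `tar tvf' size check mentioned above. It assumes GNU tar, whose verbose listing puts the member size in the third column; the function name largest_members is made up for illustration.

```shell
# List the N largest members of a tar archive, biggest first.
# Assumes GNU tar's "tar -tv" layout: perms, owner/group, size, date, time, name.
largest_members() {
    archive=$1
    count=${2:-10}          # default to the ten largest members
    tar -tvf "$archive" | sort -k3,3nr | head -n "$count"
}
```

Something like `largest_members myfile.tar 5` would then show the five biggest members, which is where a one-file-at-a-time approach is most likely to run out of room.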

shievelet 05-28-2006 04:47 PM

brilliant, i was just working on a similar bash script now

after i scrutinize it, i'll give it a whirl and post the results

randyding 05-28-2006 05:03 PM

Hello,
It's good to be cautious when using "for file in" syntax...
"for file in $(tar tf myfile.tar | tac); do"
because if the tar file is that huge the $() will expand to one giant line that is much larger than bash can handle on a single "for file in".
This is going to be a Huge! operation and it would be undesirable for 1/3 of it to complete then have to start over when bash crashes on the long line.

Instead try streaming it and process one line of the output at a time...
"tar tf myfile.tar | tac | while read filename; do"
and this will not suffer from long line problems.

Best yet, put the tar file on one harddrive and untar it to another.
Either of these two methods will do this
1. cd /emptydrive ; tar xf /fulldrive/myfile.tar
2. tar xf /fulldrive/myfile.tar -C /emptydrive
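
A small sketch of the streaming form described above, assuming GNU tar and `tac' from coreutils. The function name list_members_reversed is hypothetical, and the loop only prints each name; it does not extract or delete anything. Using `IFS= read -r' keeps names with spaces or backslashes intact, which a plain `read' would not.

```shell
# Print the members of archive $1 in reverse order, one per line,
# reading the listing as a stream instead of expanding it into one
# long word list.
list_members_reversed() {
    tar -tf "$1" | tac | while IFS= read -r name; do
        printf 'member: %s\n' "$name"
    done
}
```

Running `list_members_reversed myfile.tar | head' is a cheap way to check that the names come through as expected before putting anything destructive in the loop body.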

shievelet 05-28-2006 05:05 PM

Quote:

Originally Posted by osor
Code:

for file in $(tar tf myfile.tar | tac); do
        tar xf myfile.tar "$file" && tar f myfile.tar --delete "$file" || exit 1
done


i did a test extracting just two files (and not deleting) but the tar process just hung. the files were relatively small (~130kb each), but the harddrive was going crazy like it was trying to untar everything. i killed the process after about 3 minutes and the files were fine...but if this happens in the script, it'll never get to the delete part. any ideas?

i don't want to run the script until i'm absolutely sure this will go off without a hitch, since it took me a long long time to obtain the archive.

randyding 05-28-2006 05:12 PM

Hmm, I've done what you're describing, I think there's a misunderstanding of what --delete does.
I believe --delete causes the tar file to be "copied" and while being copied the specified files are not included in the copy. Then the original file is unlinked when the copy is complete.

So you need 2x disk space to perform a --delete. It does not delete files from the archive in place while performing an extract, the way I understand it now.

shievelet 05-28-2006 05:15 PM

Quote:

Originally Posted by randyding
Hello,
It's good to be cautious when using "for file in" syntax...
"for file in $(tar tf myfile.tar | tac); do"
because if the tar file is that huge the $() will expand to one giant line that is much larger than bash can handle on a single "for file in".
This is going to be a Huge! operation and it would be undesirable for 1/3 of it to complete then have to start over when bash crashes on the long line.

Instead try streaming it and process one line of the output at a time...
"tar tf myfile.tar | tac | while read filename; do"
and this will not suffer from long line problems.

Best yet, put the tar file on one harddrive and untar it to another.
Either of these two methods will do this
1. cd /emptydrive ; tar xf /fulldrive/myfile.tar
2. tar xf /fulldrive/myfile.tar -C /emptydrive

thanks for the tip!

unfortunately, i don't have another harddrive to untar this to, hence my problem.

but, borrowing your "tar tf myfile.tar | tac | while read filename; do" method and osor's idea, the standing code is:

Code:

tar -tf myfile.tar | tac | while IFS= read -r filename
do
tar -xf myfile.tar "$filename" && tar -f myfile.tar --delete "$filename" || exit 1
done

good?

edit:

Quote:

Hmm, I've done what you're describing, I think there's a misunderstanding of what --delete does.
I believe --delete causes the tar file to be "copied" and while being copied the specified files are not included in the copy. Then the original file is unlinked when the copy is complete.
so the above snip won't work? :(

randyding 05-28-2006 05:20 PM

Oh gosh please don't do it that way. That will literally take weeks to complete and burn out your HD if the tar file is gigabytes!
Every file will cause a copy/delete operation, each one taking a very long time to complete... and you don't have 3x space anyway.
You need 1x for the original tar file, another 1x for the copy when --delete is run, and final 1x for the extraction.
I think you need to install another hard drive, really.
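
A rough pre-flight check along the lines of the space arithmetic above, as a sketch only. The helper names free_kb, size_kb and fits are invented for this example, and the archive size is used as a stand-in for the extracted size, which is only approximately right for an uncompressed tar.

```shell
# Free space, in KiB, on the filesystem that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Size of file $1 in KiB, rounded up.
size_kb() {
    bytes=$(wc -c < "$1")
    echo $(( (bytes + 1023) / 1024 ))
}

# Succeeds if archive $1 is no larger than the free space under directory $2.
fits() {
    [ "$(size_kb "$1")" -le "$(free_kb "$2")" ]
}
```

For example, `fits /fulldrive/myfile.tar /emptydrive || echo "not enough room"' would flag the problem before any extraction starts.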

shievelet 05-28-2006 05:26 PM

i'm submitting a feature request to gnu tar then :mad:

osor 05-28-2006 06:25 PM

This is the most interesting problem i've encountered in a long time (and I encounter many problems involving low resources)!

As a last resort, I started to think of all programs that know about the `tar' format. Obviously tar has been shown not to work. The cpio command doesn't even have a delete functionality. Then I remembered a friend (more like a mentor) of mine who used to be able to edit source files of a tarball inside of emacs (in place). Being a vim user myself, I had no idea if it was just a frontend to the actual tar program. Cursory research shows that the Emacs documentation says
Quote:

You don't need the tar program to use Tar mode
which I assume to mean it has its own way of looking at tar files. If you already know about this, excuse this post. If you don't, maybe this might solve the problem.

P.S.
Is there a time constraint? When is this `due'?

shievelet 05-28-2006 07:38 PM

Quote:

Originally Posted by osor
This is the most interesting problem i've encountered in a long time (and I encounter many problems involving low resources)!

yeah this is definitely an interesting problem. i imagined it'd be easy when i started, given all that linux is capable of :\

Quote:

Originally Posted by osor
As a last resort, I started to think of all programs that know about the `tar' format. Obviously tar has been shown not to work. The cpio command doesn't even have a delete functionality. Then I remembered a friend (more like a mentor) of mine who used to be able to edit source files of a tarball inside of emacs (in place). Being a vim user myself, I had no idea if it was just a frontend to the actual tar program. Cursory research shows that the Emacs documentation says

Quote:

You don't need the tar program to use Tar mode
which I assume to mean it has its own way of looking at tar files. If you already know about this, excuse this post. If you don't, maybe this might solve the problem.

yeah, i just did a little research and apparently the tar archive format isn't all that complicated. somehow i remember being able to mount a FILE as a filesystem, which could work for me, but i don't know how to do this.

Quote:

Originally Posted by osor
P.S.
Is there a time constraint? When is this `due'?

no official time constraint, but it is a pressing matter. if i don't find a solution within the next few days, i'll just have to buckle down and reluctantly buy another 100gb+ harddrive just for this one operation. and i'm not looking forward to that.

shievelet 05-28-2006 07:42 PM

also, i do have enough gigabytes free on TWO harddrives (together) to hold the contents of the file, but i have no idea how one would untar to location A until it overflows and then continue to untar in location B...some kind of hybrid symlink..? no clue

this is so frustrating lol

edit: for the record, this tar is 75gb

shievelet 05-29-2006 07:17 AM

ok basically all i did was untar it on 3 different harddrives by killing the tar process when i ran out of space and picking up again with the "-K startfile" option.

problem solved, sorta


All times are GMT -5.