Script for unpacking archives of any sort?

Whiskerz · 08-30-2006, 04:28 AM

Hi All.

Recently I've been coming across a wide variety of archive files: tar, rar, zip, bz2, gz and the like. The only thing I ever wanted was to unpack the archives completely, maybe into a directory other than the current one; those were all the options I needed. Yet I found myself looking up the man pages again and again because I couldn't remember the options and switches each unpack program uses.

So now the question: does any of you know a script which, going by the file extension, can automatically determine how to unpack a given archive (and maybe supports an additional parameter for a different destination directory)?
I've not done any Linux scripting up to now, so I wouldn't really know where to start if I were to write something like that myself. And since a basic "unpack" script like that would be rather handy, I thought chances were good that somebody already has something like it working.

Cheers

Whizz

blackhole54 · 08-30-2006, 05:46 AM

Quote:

Originally Posted by Whiskerz

I've not done any Linux scripting up to now, so I wouldn't really know where to start if I were to write something like that myself. And since a basic "unpack" script like that would be rather handy, I thought chances were good that somebody already has something like it working.

I don't know of any such script, but it would not be hard to write one. (You always wanted to learn scripting, didn't you?)

Below is an outline of how such a bash script might look (with most of the details waiting to be filled in). Consult the bash man page for more info, particularly about the case statement.

Code:

#!/bin/sh

#   Script to unpack a variety of archive files
#   The first parameter is the name of the  archive file  (bash refers to this as "$1")
#   The second parameter ($2) is the directory to unpack the archive into.
#   $# is the number of parameters on the command line.

if [ $# -ne 2 ]; then
   echo "Wrong number of parameters!"
   exit 1
fi

#  note that each branch of the case statement must be ended with two consecutive semicolons

case $1 in
   *.gz)
             cp -p "$1" "$2" || exit 1
             cd "$2" || exit 1
             gunzip "$(basename "$1")"
             ;;

   *.zip)

#   put command(s) for zip files here        
   
             ;;

#  add other branches as desired

         *)   echo "This script does not know how to unpack file $1"
                ;;
esac

Fill in the details for each type of file you want, flavoring it with any special sauce you desire. If you want the script to accept more parameters, additional ones would be referred to in the script as $3, $4, etc., up to $9 (beyond that, braces are needed, as in ${10}).
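To illustrate how the positional parameters behave, here is a minimal sketch. The function name show_params is made up for this example; the same rules apply to a script's own command-line arguments.

```shell
#!/bin/sh
# Hypothetical helper: report how many arguments arrived and echo a few of them.
show_params() {
    echo "count=$#"
    echo "first=$1"
    # Parameters past the ninth need braces.
    echo "tenth=${10}"
    # Without braces, "$10" is read as "$1" followed by a literal 0.
    echo "unbraced=$10"
}

show_params a b c d e f g h i j
# count=10
# first=a
# tenth=j
# unbraced=a0
```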

Whiskerz · 09-07-2006, 06:47 AM

Quote:

Originally Posted by blackhole54

I don't know of any such script, but it would not be hard to write one. (You always wanted to learn scripting, didn't you?)

*g* Got me; of course I always wanted to learn scripting, but I never had the time ^.^

Quote:

Originally Posted by blackhole54

Below is an outline of how such a bash script might look (with most of the details waiting to be filled in).

Well, quite a challenge you gave me there. But after a bit of reading howtos and man pages, and asking some more Linux-savvy friends, here is what I came up with:

Code:

#!/bin/sh

#  Script to unpack a variety of archive files
#  The first parameter is the name of the  archive file  (bash refers to this as "$1")
#  The second parameter ($2) is the directory to unpack the archive into.
#  $# is the number of parameters on the command line.

if [ $# -ne 2 ]; then
   echo "Wrong number of parameters!"
   exit 1
fi

#  This is the same for all programs: make sure the destination directory exists
if [ ! -e "$2" ]; then
   echo "Destination directory doesn't exist! Creating now ..."
   mkdir -p "$2" || exit 1
fi

# Common for all branches: remember the source file, then change to the destination
srcFile="$PWD/$1"
cd "$2" || exit 1


#  Note that each branch of the case statement must be ended with two consecutive semicolons
case $1 in
   *.tgz | *.tar.gz)
#  Put command(s) for tar+gzip files here
             tar -xzvf "$srcFile"
             ;;

   *.tbz2 | *.tar.bz2 | *.tbz | *.tar.bz)
#  Put command(s) for tar+bzip files here
             tar -xjvf "$srcFile"
             ;;

   *.tar)
#  Put command(s) for tar files here
             tar -xvf "$srcFile"
             ;;

   *.z | *.gz)
#  Put command(s) for gz files here
             gunzip "$srcFile"
             ;;

   *.zip)
#  Put command(s) for zip files here
             unzip "$srcFile"
             ;;

   *.rar)
#  Put command(s) for rar files here
             unrar x "$srcFile"
             ;;

   *.bz | *.bz2)
#  Put command(s) for bzip2 files here
             bunzip2 "$srcFile"
             ;;

   *)
#  Catch the rest and print error 
             echo "This script does not know how to unpack file $1"
             ;;
esac

echo "FIN"

Tested it and it seems to work fine! If anyone should see an immediate error (I confess that I didn't test very thoroughly ^.^) I'd be happy if you could point it out!

Thanks to blackhole54 for the outline and that little push that's always needed to get started ^.^

Cheers

Whizz

blackhole54 · 09-07-2006, 10:00 AM

Congratulations! I was afraid I had scared you away forever!

I think you will find that even a little knowledge of bash scripting will improve your computing experience. Scripts, both simple and complex, are quite handy, as your script shows.

I am not sure you did what you wanted with gunzip. This command works a little differently from the other unpacking commands. By default, gunzip replaces the original file with the unzipped version (bunzip2 behaves the same way). So with your script as currently written, the original file will disappear and the new file will appear in the original directory, while the new directory (if it was newly created) will remain empty. If you want it to work like the other commands, you could first copy the file to the new directory and then (using the basename command) unzip it there, or you could use the -c option for gunzip and redirect its output (using > ) to the new file.
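The two approaches could be sketched roughly as follows. The function names gunzip_copy and gunzip_keep are made up for illustration, and the output filename for the -c variant is left to the caller.

```shell
#!/bin/sh
# Hypothetical helper, option 1: copy the archive into the destination
# directory and decompress the copy there. The original stays where it was.
gunzip_copy() {
    # $1 = compressed file, $2 = existing destination directory
    cp -p "$1" "$2" && ( cd "$2" && gunzip "$(basename "$1")" )
}

# Hypothetical helper, option 2: -c sends the decompressed data to stdout,
# so the original is left untouched and the output goes wherever we redirect it.
gunzip_keep() {
    # $1 = compressed file, $2 = output file to create
    gunzip -c "$1" > "$2"
}
```

With the first form the copied .gz file in the destination is consumed by gunzip; with the second, no copy is made at all.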

Also, the way you create the srcFile variable won't work correctly if the supplied filename uses an absolute path (one beginning with a slash). You could instead:

Code:

case $1 in
   /*)  srcFile=$1
        ;;
    *)  srcFile=$PWD/$1
        ;;
esac

But overall, it looks like a great first script. Glad I could give you a nudge!

Whiskerz · 09-08-2006, 05:18 AM

Thanks again for your advice! I used your case expression just to be safe in case of absolute paths.

For gunzip, I opted for the "-c" option; however, I was wondering how to get the filename I want to save to. My only idea was to split the case branch

Code:

*.z | *.gz)

so I could be sure what the file extension was and remove it from the original filename to give the unzipped filename (see below). I tried using a regular expression for substringing like (\.z|\.gz) but I didn't get it to work. Does anyone know of a more general way which doesn't require double code?

Code:

   *.z)
#  Put command(s) for .z files here
             base=$(basename "$1")
             gunzip -c "$srcFile" > "${base%.z}"
             ;;

   *.gz)
#  Put command(s) for .gz files here
             base=$(basename "$1")
             gunzip -c "$srcFile" > "${base%.gz}"
             ;;

makyo · 09-08-2006, 08:47 AM

Hi.

Well done; this was one of the best collaborations I have seen. Good guidance, good experimentation.

In the part:

Quote:

I tried using a regular expression for substringing like (\.z|\.gz)

The individual parts of the case selector are not REs, but are pathname expressions (filename globbing). The syntax with "|" makes the entire expression look like an RE, but it's not. One way I remember this is that the shell is processing the case, so it's more likely to be a glob. It is an interesting application, since the strings don't actually need to be filenames, but they are matched as if they are.

Code:

... A case command first expands word, and tries to match it against
              each pattern in turn, using the same matching rules as for
              pathname expansion (see Pathname Expansion below). -- excerpt from man bash
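A small sketch of what glob matching in a case statement means in practice. The function name is_gz and the sample strings are made up; none of them needs to exist as a file.

```shell
#!/bin/sh
# Hypothetical helper: test a string against a case pattern (a glob, not a regex).
is_gz() {
    case $1 in
        *.gz) echo yes ;;
        *)    echo no ;;
    esac
}

is_gz a.gz      # yes
is_gz axgz      # no: in a glob "." is a literal dot, not "any character"
is_gz /x/y.gz   # yes: in a case pattern "*" also matches across "/"
```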

If this is just for yourself, then you know what's going on. If it's more public than that, I think I might add the use of the file command to identify the type of the file -- but then, I've been known to be paranoid.
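As a rough idea of what content-based detection might look like, here is a sketch that maps a MIME type string to one of the tools used in this thread. The function name tool_for_mime is made up, and the MIME strings are assumptions: they vary between versions of file, so check what your system actually reports.

```shell
#!/bin/sh
# Hypothetical helper: map a MIME type string to an unpack tool.
# The MIME strings below are assumptions and differ between file(1) versions.
tool_for_mime() {
    case $1 in
        application/gzip | application/x-gzip)   echo gunzip ;;
        application/x-bzip2)                     echo bunzip2 ;;
        application/zip)                         echo unzip ;;
        application/x-tar)                       echo tar ;;
        application/x-rar | application/vnd.rar) echo unrar ;;
        *)                                       echo unknown ;;
    esac
}

# Usage sketch (needs file(1); archive.bin is a placeholder name):
#   tool_for_mime "$(file -b --mime-type archive.bin)"
```

Note that a compressed tarball would most likely be reported by its outer compression only, so a real script would need a second look after decompressing.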

Finally, I'm not certain if this will address the combined case question you posed, but it may be useful. Note that these expressions get very dense, and it's difficult to recall who is doing what to whom. I often refer to the O'Reilly book on bash to remind myself, although there might be enough in man bash to help.

Again -- it was a pleasure to read this thread ... cheers, makyo

Code:

#!/bin/sh

# @(#) s1       Demonstrate case alternation.

SET="a.z b.gz c.GZ d.x /long/path/name/e.y"

for i in $SET
do
        case $i in
                *.z | *.gz | *.GZ | *.x |*.y )
                echo ; echo "old = :$i:" ;
                name=${i##*/} ; echo "name = :$name:" ;
                base=${name%.*} ; echo "base = :$base:" ;
                ext=${name##*.} ; echo "ext = :$ext:" ;
                echo "new = :new.$ext:" ;;
        esac
done

Which results in:

Code:

% ./s1

old = :a.z:
name = :a.z:
base = :a:
ext = :z:
new = :new.z:

old = :b.gz:
name = :b.gz:
base = :b:
ext = :gz:
new = :new.gz:

old = :c.GZ:
name = :c.GZ:
base = :c:
ext = :GZ:
new = :new.GZ:

old = :d.x:
name = :d.x:
base = :d:
ext = :x:
new = :new.x:

old = :/long/path/name/e.y:
name = :e.y:
base = :e:
ext = :y:
new = :new.y:

blackhole54 · 09-08-2006, 12:25 PM

Quote:

Originally Posted by Whiskerz

Does anyone know of a more general way which doesn't require double code?

I believe this will do it. Make sure you test it!

Code:

   *.z | *.gz)
             base=$(basename "$1")
             gunzip -c "$srcFile" > "${base%.*z}"
             ;;

For more complicated situations where the extensions are not so similar, you could strip off an arbitrary extension using the sed command:

Code:

stripped_name=$(basename "$1" | sed 's/\.[^.]*$//')

This gets you into the realm of real regular expressions. They are quite powerful, but take a while to get used to. (At least it took me a while.)

@makyo: The above is in the original spirit of the script, which was to use extensions. The file program could, of course, be used, but then you have much more complicated decisions to make.

EDIT: Silly me ... Always trying to make things too difficult. An arbitrary extension could be handled:

Code:

${base%.*}
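To see how the trimming operators differ, here is a small sketch; the filename is made up for illustration. A single % or # removes the shortest match, a doubled one the longest.

```shell
#!/bin/sh
name=backup.tar.gz

echo "${name%.*}"    # backup.tar  (shortest match removed from the end)
echo "${name%%.*}"   # backup      (longest match removed from the end)
echo "${name#*.}"    # tar.gz      (shortest match removed from the start)
echo "${name##*.}"   # gz          (longest match removed from the start)
```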

blackhole54 · 09-08-2006, 12:51 PM

Quote:

Originally Posted by makyo

The individual parts of the case selector are not REs, but are pathname expressions (filename globbing). The syntax with "|" makes the entire expression look like an RE, but it's not. One way I remember this is that the shell is processing the case, so it's more likely to be a glob. It is an interesting application, since the strings don't actually need to be filenames, but they are matched as if they are.

The terminology does not seem to be that clear-cut. Certainly these are not regular expressions in the sense that grep, sed, etc. use the term. (And even they don't totally agree among themselves.) The bash man page calls these patterns, and they do seem to follow (at least more or less) the globbing rules. But I have seen these bash constructs called regular expressions, including in the O'Reilly book Linux in a Nutshell. And you know those O'Reilly folk ain't no slouches! So personally, I don't get dogmatic about what they are called; I just try to communicate.

And oh, thanks for the kind words.

Whiskerz · 09-13-2006, 10:47 AM

Quote:

Originally Posted by blackhole54

I believe this will do it. Make sure you test it!

Code:

   *.z | *.gz)
             base=$(basename "$1")
             gunzip -c "$srcFile" > "${base%.*z}"
             ;;

*slaps his forehead* I must've been really tired ... but at least I was nearly there, just got mixed up a bit with the exact usage of "%" for cutting strings ^.^

Tested and works!

Thanks for your info and support, makyo! Regarding the case selectors, however, that was a bit of a mix-up on my side: I didn't use REs there (I didn't know the exact syntax, but I knew those weren't what you usually consider regular expressions). Rather, I tried to extract a substring from my original filename using REs and the

Code:

expr "$string" : '.*\($substring\)'

construct I found in the bash scripting guide. However, I didn't get it to do what I wanted and left it at that ^.^
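For reference, a sketch of how that expr construct could strip the extension. The function name strip_gz is made up, and the pattern is only one possible choice. Inside single quotes nothing is expanded by the shell, so the regular expression has to be written out literally in place of $substring.

```shell
#!/bin/sh
# Hypothetical helper: strip a trailing .z or .gz using expr's regex match.
strip_gz() {
    # The pattern is a basic regular expression anchored at the start of the
    # string; expr prints whatever the \( ... \) group captured.
    expr "$1" : '\(.*\)\.g*z$'
}

strip_gz notes.txt.gz   # notes.txt
strip_gz data.z         # data
```

If nothing matches, expr prints an empty line and returns a non-zero status, so a real script should check for that.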

Thanks again for helping. If anyone knows useful additions feel free to give some hints :-P

*waves*

Whizz

pengu · 09-13-2006, 08:38 PM

A nice GUI program is Ark; I believe it comes with KDE.

For GNOME, I think there is a program called file-roller.