LinuxQuestions.org


makowka 04-27-2007 10:06 AM

How to get filename extension
 
Hi

I wrote a small and easy bash script, which just renames a file name to another. Here it is:

#!/bin/bash
echo "Original filename (without extension)"
read n
echo "New filename"
read m
f=`pwd`
a=`ls $f | grep $n`
b=`echo $a | sed ????????????`
mv $f/$a $f/$m.$b

The only problem is with sed, as I want to get the filename's extension. Sed is kind of sophisticated; could someone help me get the extension from the filename (I don't need the filename, only its extension)?

Thanks in advance
Szymon

omnio 04-27-2007 10:22 AM

Code:

#!/bin/bash

file="archive.tar.gz"

echo "filename without extension is: ${file%%.*}"
echo "extension is: ${file#*.}"

http://www-128.ibm.com/developerwork...sh.html#N1010C
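
For comparison, here is a small sketch of all four pattern-removal expansions applied to the same name. The variable names are only illustrative and are not from the post above. The single operators (# and %) remove the shortest match, the doubled ones (## and %%) the longest, which is why the example above reports "tar.gz" rather than "gz" as the extension.

```shell
#!/bin/bash
file="archive.tar.gz"

# '#' and '##' strip a matching prefix; '%' and '%%' strip a matching suffix.
# The single form removes the shortest match, the doubled form the longest.
short_prefix=${file#*.}    # removes "archive."      -> tar.gz
long_prefix=${file##*.}    # removes "archive.tar."  -> gz
short_suffix=${file%.*}    # removes ".gz"           -> archive.tar
long_suffix=${file%%.*}    # removes ".tar.gz"       -> archive

printf '%s\n' "$short_prefix" "$long_prefix" "$short_suffix" "$long_suffix"
```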

MensaWater 04-27-2007 10:25 AM

That would depend a lot on what you mean by extension. DOS/Windoze requires "extensions" such as .exe, .cmd, .bat, .jpg, etc., but Linux/UNIX do not. You can add such "suffixes", but they are just another part of the file name. Also, since the "." has no special meaning, you could actually have several of them, for example:
this.is.my.favorite.file

The end of the file name is ".file" but that is not an extension.

Now you CAN get the end of the file name but you need to be aware of the foregoing. If you KNEW that all you had to search through were files that only had one dot, like standard DOS/Windoze files, you could do:

b=`echo $a | awk -F. '{print $2}'`

Here you use awk rather than sed. The -F says to set the delimiter to dot (.) and the print statement says to print the second field which would be the one after the dot.

If you did that on the example above however instead of getting "file" as the "extension" you'd get "is" as the extension.
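
As a hedged aside, awk also has the built-in variable NF (the number of fields), so $NF refers to the last field instead of the second one. The sketch below, with illustrative variable names, shows both on the multi-dot example. Note that for a name with no dot at all, $NF is the whole name.

```shell
#!/bin/bash
a="this.is.my.favorite.file"

# -F. splits the name on dots; $2 is the second field, $NF the last one.
second=$(printf '%s\n' "$a" | awk -F. '{print $2}')
last=$(printf '%s\n' "$a" | awk -F. '{print $NF}')

echo "second field: $second"   # is
echo "last field:   $last"     # file
```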

makowka 04-27-2007 01:38 PM

Thanks, jlightner, your method works! I have found another approach to that somewhere on the Internet:

Code:

b=`echo $a | sed -e 's/.*\.//'`

I don't understand much of that, but it works, too.

Exactly, I need to treat the filename as "name.extension", although that is not Linux style at all (an old Windows habit). I just wanted to save the user (me) from typing the whole filename, including the "extension" (I always avoid having multiple dots in a filename). In fact, to rename the file test.tar to some.tar, one could just type "te" at the first prompt, and the script should find the corresponding file (thanks to grep, really a great thing), on the condition that no other file in the current directory has "te" in its name.

Sed seems to be really complicated. Could you tell me what exactly is going on in the line above?

Thanks for your answers
Szymon

MensaWater 04-27-2007 02:43 PM

s/.*\.//

s = substitute
first / = locate (search for)

. = regular expression (regexp) special character meaning "concatenate" (join together what came before and after it).

* = match all characters.

\ = "escape" the character that follows. Some characters have special meaning (. for example means "concatenate") in regexp as noted above so putting "\." means look literally for the character "." rather than trying to use it to concatenate.

So .*\. means look for any pattern up to and including the literal character "." (dot)

second / = replace with - This would be followed by what ever pattern you wanted to put in place of what you searched for.

third / = end of the replacement. Since there was nothing between the second / and the third /, you're essentially saying "replace all characters up to and including the last dot with nothing", which is a cute way of deleting them.

The above syntax (s/pattern/replacement/) is the most common usage of sed. One thing not done there (because it wasn't necessary) is adding a g after the third slash, which means "global". If you were using sed to process the text IN a file rather than just a list of names, and a line had multiple occurrences of "pattern", then without the g only the first occurrence on each line would be changed.
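
A minimal sketch of the difference the g flag makes, using a made-up input string and replacing each literal dot with an underscore:

```shell
#!/bin/bash
line="one.two.three"

# Without g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/\./_/')
# With g, every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed 's/\./_/g')

echo "$first_only"   # one_two.three
echo "$every"        # one_two_three
```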

taylor_venable 04-27-2007 02:49 PM

Quote:

Originally Posted by jlightner
. = regular expression (regexp) special character meaning "concatenate" (join together what came before and after it).

* = match all characters.

More accurately, full-stop (.) matches any single character, and star (*) matches zero or more occurrences of the previous element (in this case, it means zero or more occurrences of any single character).
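
These rules can be checked directly with sed. The sketch below uses illustrative variable names. Because .* is greedy, .*\. reaches all the way to the last dot, while [^.]*\. (zero or more non-dot characters followed by a dot) stops at the first one.

```shell
#!/bin/bash
name="this.is.my.favorite.file"

# An unescaped '.' matches any single character, so one character is removed.
one_char=$(printf '%s\n' "$name" | sed 's/.//')
# '.*\.' matches as much as possible, i.e. everything up to the LAST dot.
after_last_dot=$(printf '%s\n' "$name" | sed 's/.*\.//')
# '[^.]*\.' cannot cross a dot, so it only removes up to the FIRST dot.
after_first_dot=$(printf '%s\n' "$name" | sed 's/[^.]*\.//')

echo "$one_char"          # his.is.my.favorite.file
echo "$after_last_dot"    # file
echo "$after_first_dot"   # is.my.favorite.file
```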

makowka 04-30-2007 02:29 PM

Jlightner, thank you very much for the explanation of the sed command; now it is somewhat easier for me to understand what was written. Writing more complicated programs is not for me, I think, but unlike Windows, Linux has the advantage of shipping bash and other environments with the core system, so one can write easy "applets" and useful programs (for example, with gdialog, a very good thing).

Thanks again.

cfaj 04-30-2007 05:32 PM

Quote:

Originally Posted by makowka
Hi

I wrote a small and easy bash script, which just renames a file name to another. Here it is:

#!/bin/bash
echo "Original filename (without extension)"
read n
echo "New filename"
read m
f=`pwd`


There is no need to use command substitution (which is slow). All POSIX shells have the PWD parameter:

Code:

f=$PWD
(But even that is not needed; see below.)
Quote:

a=`ls $f | grep $n`

There are several problems with this line.

1. It will fail if the directory name, $f, contains whitespace or a character special to the shell (e.g., shell wildcard).
2. It will fail if the search string, $n, contains whitespace or a character special to the shell.
3. You are using two unnecessary external commands, which will slow your script considerably. (It's not bad for a single execution, but if you incorporate this into a larger script, especially if there are other such instances, or if it's in a loop, the delay can become noticeable.)
4. It will break your script if there's more than one file whose name contains the search string.

You should use a wildcard to generate the file name or names.
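
One possible way to do that in bash is sketched below. The function name find_one is hypothetical, and this is only one reading of the advice above. Unlike grep, the wildcard treats the search string as literal text rather than a regular expression, and by default it skips names that start with a dot.

```shell
#!/bin/bash
# Hypothetical helper: print the single file in the current directory whose
# name contains the string $1; fail if there are zero or several matches.
find_one() {
    local matches
    matches=( *"$1"* )    # quoting $1 keeps spaces and wildcards in it literal
    # An unmatched wildcard is left as the literal pattern, so test existence.
    if [ ! -e "${matches[0]}" ]; then
        echo "no file matches '$1'" >&2
        return 1
    elif [ "${#matches[@]}" -gt 1 ]; then
        echo "more than one file matches '$1'" >&2
        return 1
    fi
    printf '%s\n' "${matches[0]}"
}

# Possible usage: a=$(find_one "$n") || exit 1
```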

Quote:

b=`echo $a | sed ????????????`


POSIX shells (e.g., bash) have parameter expansions that can remove leading or trailing patterns from a variable's value:

Code:

suffix=${filename##*.} ## suffix only
name=${filename%.*}  ## filename without the suffix
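
One caveat worth knowing: if the name contains no dot, the pattern *. does not match, so ${filename##*.} returns the whole name unchanged. The sketch below guards against that with a case statement. split_name is a hypothetical helper, and treating a leading-dot name such as .bashrc as having no suffix is an assumption on my part, not something stated above.

```shell
#!/bin/bash
# Hypothetical helper: split $1 into the global variables name and suffix.
split_name() {
    local filename=$1
    case $filename in
        # At least one character, then a dot: there is a suffix to split off.
        ?*.*) name=${filename%.*}; suffix=${filename##*.} ;;
        # No dot, or only a leading dot: treat the whole thing as the name.
        *)    name=$filename; suffix= ;;
    esac
}

split_name "archive.tar.gz"
echo "name=$name suffix=$suffix"   # name=archive.tar suffix=gz
```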

Quote:

mv $f/$a $f/$m.$b

That line could fail for the same reasons mentioned above.
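
To see the whitespace problem concretely, the sketch below counts how many arguments a command receives with and without quotes. count_args is a hypothetical helper and the file name is made up. With the default IFS, the unquoted variable is split into two words, so mv would see two separate (and wrong) file names.

```shell
#!/bin/bash
# Hypothetical helper that reports how many arguments it received.
count_args() { echo "$#"; }

a="my file.txt"
unquoted=$(count_args $a)     # word splitting: "my" and "file.txt"
quoted=$(count_args "$a")     # a single argument: "my file.txt"

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```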

There's no point to using $f/ since you are only looking at files in the current directory.

The variables should be quoted:

Code:

mv "$a" "$m.$b"
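
Putting the suggestions together, the whole task could look roughly like the sketch below. It is one possible arrangement, not a definitive rewrite of the original script. rename_keep_suffix is a hypothetical function name, and the code assumes the matched name contains a dot (otherwise the whole old name ends up as the suffix).

```shell
#!/bin/bash
# Hypothetical function: rename the one file in the current directory whose
# name contains $1 to "$2.<suffix>", keeping the text after the last dot.
rename_keep_suffix() {
    local n=$1 m=$2 a b
    set -- *"$n"*                 # let the shell expand the wildcard
    if [ "$#" -ne 1 ] || [ ! -e "$1" ]; then
        echo "need exactly one file matching '$n'" >&2
        return 1
    fi
    a=$1
    b=${a##*.}                    # text after the last dot
    mv -- "$a" "$m.$b"
}

# Possible interactive use, mirroring the original prompts:
#   echo "Original filename (without extension)"; read -r n
#   echo "New filename"; read -r m
#   rename_keep_suffix "$n" "$m"
```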
Quote:

The only problem is with sed, as I want to get the filename's extension. Sed is kind of sophisticated; could someone help me get the extension from the filename (I don't need the filename, only its extension)?

