LinuxQuestions.org


lumix 05-08-2008 01:47 PM

A single regex to match anything with ".aac" or ".mp3" at the end ?
 
Tried ".*/.[\baac\b\bmp3\b]$" but no go. Tried many variations of this?

Any ideas?

MensaWater 05-08-2008 02:05 PM

Depends on what you're using.

For egrep it is easy so piping into egrep might be the way to go:

<whatever> |egrep "aac$|mp3$"

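The $ tells it to match only at the end of the line. Note that the pattern has no dot in it, so it would also match a name like testaac; use "\.aac$|\.mp3$" if the dot has to be there.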
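For what it's worth, here is a small sketch of the end anchor at work. The file names are made up for illustration, and `grep -E` is used as the current spelling of `egrep`. It compares anchoring on the letters alone with anchoring on a literal dot plus the extension.

```shell
# Hypothetical names, one per line, standing in for <whatever>.
names=$(printf '%s\n' song.mp3 track.aac testaac mp3song.txt)

# Letters anchored to the end of the line: "testaac" still qualifies.
letters_only=$(printf '%s\n' "$names" | grep -E 'aac$|mp3$')

# Literal dot plus extension anchored to the end of the line.
with_dot=$(printf '%s\n' "$names" | grep -E '\.(aac|mp3)$')

printf 'letters only:\n%s\n\nwith dot:\n%s\n' "$letters_only" "$with_dot"
```

Neither form picks up mp3song.txt, because nothing after the anchor point is allowed.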

lumix 05-08-2008 02:08 PM

Needs to be a posix regexp
 
I won't bore you with the details, but my solution has to use "find -regex".

Thanks for the tip though...didn't know grep could do that.

ErV 05-08-2008 02:09 PM

Quote:

Originally Posted by lumix (Post 3147190)
Tried ".*/.[\baac\b\bmp3\b]$" but no go. Tried many variations of this?

Any ideas?

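By using that regex you are asking the machine for a string that consists of any number of characters, then a "/", then any single character, then exactly one character from the set "\bacmp3" (square brackets match one character, not a word, and \b has no special meaning inside them), and then the end of the string.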
By the way, where do you want to use regex?
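A rough way to check that reading is to feed the original pattern a few made-up strings through plain grep. This is only a sketch: the inputs are hypothetical, and grep's basic syntax is assumed to treat the bracket expression the same way as described above.

```shell
# Hypothetical inputs. The bracket expression consumes exactly one character,
# so only a slash followed by two more characters can sit at the end of the line.
one_char=$(printf '%s\n' dir/xa dir/x3 dir/song.mp3 | grep '.*/.[\baac\b\bmp3\b]$')
printf '%s\n' "$one_char"
```

The two short names get through, while dir/song.mp3 does not, which is the opposite of what was wanted.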

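You can use extended regular expressions (egrep), or basic ones with the grouping and alternation operators escaped (the \| form is a GNU grep extension):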
Code:

ls|egrep "^.*\.(aac|mp3)$"
ls|grep "^.*\.\(aac\|mp3\)$"

or (ugly way, useful only if grouping with parentheses is not supported)
Code:

ls |grep "^.*\.[acmp3]\{3\}$"
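Note that the leading ^.* is not really necessary, since grep looks for a match anywhere in the line. The $ is needed, though: without it the extension could match in the middle of a name. Also, the ugly way accepts any three characters from the set, so it matches things like .ccc or .3pm too.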
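To see how loose the bracket-and-repeat form is, here is a quick sketch with made-up names (none of them come from the thread):

```shell
# Hypothetical names; the set [acmp3] repeated three times accepts any mix.
loose=$(printf '%s\n' song.mp3 track.aac odd.ccc odd.3pm clip.avi \
    | grep '\.[acmp3]\{3\}$')
printf '%s\n' "$loose"
```

Only clip.avi is rejected, since v and i are not in the set.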

MensaWater 05-08-2008 02:22 PM

find . -name "*[am][ap][c3]"

This looks for any file that ends with a or m in 3rd from last character, a or p in 2nd from last and c or 3 in last. It will match the files you want but might match other oddities if they are there (e.g. aa3, mpc, apc...). There is no dot in the pattern either, so a name like testaac would also match.
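A small sketch of those oddities, using a scratch directory and invented file names. Adding a literal dot to the glob (an assumption about what is wanted, not part of the post above) removes the testaac case but not the mpc one.

```shell
# Scratch directory with hypothetical file names.
tmp=$(mktemp -d)
touch "$tmp/song.mp3" "$tmp/track.aac" "$tmp/module.mpc" "$tmp/testaac" "$tmp/notes.txt"

# Original glob: three bracket positions, no dot required.
no_dot=$(cd "$tmp" && find . -type f -name '*[am][ap][c3]' | LC_ALL=C sort)

# Same glob with a literal dot in front of the three positions.
with_dot=$(cd "$tmp" && find . -type f -name '*.[am][ap][c3]' | LC_ALL=C sort)

printf 'no dot:\n%s\n\nwith dot:\n%s\n' "$no_dot" "$with_dot"
rm -rf "$tmp"
```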

colucix 05-08-2008 02:38 PM

Why not simply...?
Code:

find . -name \*.mp3 -o -name \*.aac
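One thing to watch if more tests get added to that command: in find, the implicit "and" binds tighter than -o, so the alternatives usually want to be grouped with escaped parentheses. A minimal sketch, assuming a scratch directory with invented names (including a directory that happens to end in .mp3):

```shell
tmp=$(mktemp -d)
touch "$tmp/song.mp3" "$tmp/track.aac"
mkdir "$tmp/live.mp3"        # a directory whose name ends in .mp3

# Ungrouped: -type f binds only to the second -name test.
ungrouped=$(cd "$tmp" && find . -name '*.mp3' -o -name '*.aac' -type f | LC_ALL=C sort)

# Grouped: -type f applies to both alternatives.
grouped=$(cd "$tmp" && find . \( -name '*.mp3' -o -name '*.aac' \) -type f | LC_ALL=C sort)

printf 'ungrouped:\n%s\n\ngrouped:\n%s\n' "$ungrouped" "$grouped"
rm -rf "$tmp"
```

The ungrouped form lists the live.mp3 directory as well; the grouped form lists only the two regular files.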

lumix 05-08-2008 03:00 PM

Quote:

Originally Posted by colucix (Post 3147249)
Why not simply...?
Code:

find . -name \*.mp3 -o -name \*.aac

Again, the sordid details. Add to these details that I'd also like very much to learn more about the power of regular expressions.

My regex is in a php script, so while the other solution is very cool and creative "[am][ap][c3]", I need to be able to insert new file extensions by variable into my regex string. In other words, I need to be able to insert "avi" into this regex pattern string, and then use it in my find command to look for new file types. I'm pretty sure this is possible, and for various reasons I must use find and regexes.

Thanks for the response so far, btw.
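As a sketch of the "insert new extensions by variable" idea, the pattern can be assembled from a list before it is handed to find. This is shown in shell rather than PHP; build_regex is a hypothetical helper name, the file names are made up, and the extensions are assumed to be plain letters and digits with no regex metacharacters in them.

```shell
# Hypothetical helper: join plain extensions into a pattern for find's
# default regex syntax, e.g. .*\.\(aac\|mp3\)
build_regex() {
    alternation=$(printf '%s\n' "$@" | paste -sd'|' - | sed 's/|/\\|/g')
    printf '.*\\.\\(%s\\)' "$alternation"
}

pattern=$(build_regex aac mp3 avi)
printf '%s\n' "$pattern"

# Try the generated pattern against a scratch directory.
tmp=$(mktemp -d)
touch "$tmp/song.mp3" "$tmp/clip.avi" "$tmp/notes.txt"
hits=$(cd "$tmp" && find . -regex "$pattern" | LC_ALL=C sort)
printf '%s\n' "$hits"
rm -rf "$tmp"
```

Adding a new file type is then just one more argument to the helper.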

lumix 05-08-2008 03:20 PM

An answer: but why the need to escape so many chars?

The following is a fairly concise and easy to change solution. But why do I have to escape the so-called "round-brackets"?

me@localhost:/AV/Sea Change$ find -regex ".*\.\(aac\|mp3\)"


It very nicely ignores, for example, ./testaac or mp3song.txt.
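Here is a hedged sketch of that behaviour in a scratch directory with the same kind of names. Two assumptions: GNU find is in use (the -regextype option is GNU-specific), and the file names are invented. It also shows why the leading .* is there: -regex has to match the whole path, not just the end of it.

```shell
tmp=$(mktemp -d)
touch "$tmp/song.mp3" "$tmp/track.aac" "$tmp/testaac" "$tmp/mp3song.txt"

# Default syntax: grouping and alternation are written \( \) and \|.
default_syntax=$(cd "$tmp" && find . -regex '.*\.\(aac\|mp3\)' | LC_ALL=C sort)

# GNU find only: posix-extended syntax takes ( ) and | unescaped.
extended=$(cd "$tmp" && find . -regextype posix-extended -regex '.*\.(aac|mp3)' | LC_ALL=C sort)

# -regex must match the whole path, so dropping the leading .* finds nothing.
no_prefix=$(cd "$tmp" && find . -regex '\.\(aac\|mp3\)')

printf 'default:\n%s\n\nextended:\n%s\n\nno prefix:\n%s\n' \
    "$default_syntax" "$extended" "$no_prefix"
rm -rf "$tmp"
```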

MensaWater 05-08-2008 03:48 PM

Because different tools (including the shell) each interpret the characters they see. Unquoted, the shell treats parentheses as "grouping" of commands and the vertical bar as a "pipe", so it would think you were attempting to pipe command aac into command mp3. Inside double quotes the shell leaves those characters alone and passes them to find, so here the backslashes are for find itself: in its default regex syntax a bare ( ) or | is an ordinary character, and only \( \) and \| mean grouping and alternation. Escaping and quoting is one of the most maddening things you'll deal with in scripting.

There is a command line I've used that actually quotes escapes and escapes quotes to work correctly. That looks nonsensical, but when you're in a shell piping things into awk and/or grep it is sometimes necessary, so that one tool treats a character literally and passes it on unchanged to the tool you actually want to interpret it.
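One way to see which layer consumes what is to print an argument exactly as a command receives it. In this sketch show_arg is a hypothetical helper; it suggests that inside either kind of quotes the shell hands the backslashes, parentheses and bar through untouched, so they are still there for find to interpret.

```shell
# Hypothetical helper that prints its first argument exactly as received.
show_arg() { printf '%s\n' "$1"; }

double_quoted=$(show_arg ".*\.\(aac\|mp3\)")
single_quoted=$(show_arg '.*\.\(aac\|mp3\)')

printf 'double: %s\nsingle: %s\n' "$double_quoted" "$single_quoted"
```

Both lines show the same text, backslashes included.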

ErV 05-09-2008 02:11 AM

Quote:

Originally Posted by lumix (Post 3147282)
...
But why do I have to escape the so-called "round-brackets"?
...

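To point out that they are special symbols (grouping), not literal characters that must exist in the string. In basic regex syntax, and in find's default syntax, an unescaped ( or | is just an ordinary character.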
There is a regex tutorial, you might want to read it: http://www.grymoire.com/Unix/Regular.html.


All times are GMT -5.