LinuxQuestions.org


Wolfjurgen 04-25-2011 11:25 PM

Bash script question, searching in a variable, displaying search results...
 
Hey Guys,

I'm trying to figure out how I can search within a variable and assign the results to a new variable. I'll use the following as an example -

cars="Audi BMW Cadillac Chevy Dodge Ferrari Ford Mercedes"
list=`echo ${cars} | egrep -o '\<A?+|\<C+'`

with the echo command I get the following output assigned to list -
A
C
C

What I'd like to get for output is -
Audi
Cadillac
Chevy

Anyone have an idea how I can accomplish this? Also, any tips on how I could do this regardless of upper/lower case letters?

David the H. 04-26-2011 12:51 AM

Your bash syntax is just fine, although $(..) is recommended over `..`.

The problem is with your grep regex. It's not matching what you want it to. "C+", for example, matches a variable-length string of C's, and only C's.

Code:


cars="Audi BMW Cadillac Chevy Dodge Ferrari Ford Mercedes"
list=$( echo ${cars} | egrep -io '\<[AC][^[:space:]]+[[:space:]]' )

Notice first of all how you can combine everything into a single expression. This one matches at the beginning of a word (\< is a zero-width anchor, so it doesn't consume a space itself), followed by an A or C, followed by one or more non-space characters, followed by a single whitespace character.

[^x]+x (match anything not-x until you match an x) is a common pattern used for getting around regex greediness, as * and + will keep going until they find the longest possible match of the expression they affect. If you simply use .* or .+, it will match everything to the end of the line, or to the last point on the line where the rest of the pattern can still match.
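
To make the greediness point concrete, here is a small sketch. The sample string and the variable names are made up for illustration, and it assumes a grep that supports -o (GNU and BSD grep both do; it isn't in POSIX).

```shell
# Hypothetical sample text, not from the original question.
text="key=value=more"

# Greedy: '.+=' runs on to the LAST '=' on the line.
greedy=$(echo "$text" | grep -Eo '.+=')

# Bounded: '[^=]+=' can't cross an '=', so each match stops at the next one.
bounded=$(echo "$text" | grep -Eo '[^=]+=')

echo "greedy:  $greedy"
echo "bounded:"
echo "$bounded"
```

The first pattern should give one long match, while the second gives one match per "=", each on its own line.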

I also used the [:space:] character class, which matches space, tab, newline, and a few other rarer characters (read info grep), but if you know what the input will have, you can use those directly instead.

For case insensitivity, use grep's -i option.
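
As a rough illustration of what -i changes, the sketch below runs the same pattern with and without it. The mixed-case list is invented for the example, and \< is a GNU extension, so this assumes GNU grep (or one that is compatible with it).

```shell
# Hypothetical mixed-case input; the last word deliberately doesn't match.
cars="audi BMW cadillac Chevy Dodge"

# Case-sensitive: only words starting with an uppercase A or C.
sensitive=$(echo "$cars" | grep -Eo '\<[AC][^[:space:]]+[[:space:]]')

# With -i: [AC] also matches a and c.
insensitive=$(echo "$cars" | grep -Eio '\<[AC][^[:space:]]+[[:space:]]')

# Brackets make the trailing space captured by [[:space:]] visible.
printf 'sensitive:   [%s]\n' "$sensitive"
printf 'insensitive: [%s]\n' "$insensitive"
```

Note that each match keeps the whitespace character the pattern consumed after the word.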

Wolfjurgen 04-26-2011 09:22 AM

David the H. that is exactly what I was looking for!

I've got my script doing exactly what I want now. You sir are a scholar and a gentleman. Also, thanks for the BashFAQ link, good stuff.

Wolfjurgen 04-26-2011 12:04 PM

For anyone else who may find this useful, I had to tweak the egrep command to find the last matching item in a variable.

Consider the following -

~$ cars="Acura Audi BMW Buick Cadillac Chevrolet Chrysler Dodge Saab Saturn Subaru Suzuki"

I'll search for anything that starts with the letters "a", "d" or "s" in $cars -
~$ echo ${cars} | egrep -io '\<[ads][^[:space:]]+[[:space:]]'
Acura
Audi
Dodge
Saab
Saturn
Subaru

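NOTICE how "Suzuki" is not listed in the output. It's the last word in the variable, so no whitespace follows it and the trailing [[:space:]] has nothing to match. In order to get the last matching item in the variable, do -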

~$ echo ${cars} | egrep -io '\<[ads][^[:space:]]+[[:space:]]*'
Acura
Audi
Dodge
Saab
Saturn
Subaru
Suzuki

All I had to do was add an "*" at the end of the regex, which makes the trailing whitespace optional. Many thanks to David the H. once again!

MTK358 04-26-2011 12:11 PM

Actually, what's the point of the last "[[:space:]]"? The "[^[:space:]]+" part will not match beyond the end of the word anyway.
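
A quick sketch of the shortened pattern, reusing the list from the post above. The variable name "list" and the use of grep -E in place of egrep are my own choices here, and \< assumes GNU grep.

```shell
cars="Acura Audi BMW Buick Cadillac Chevrolet Chrysler Dodge Saab Saturn Subaru Suzuki"

# No trailing [[:space:]] at all: [^[:space:]]+ already stops at the end of
# each word, so the last word matches too and no whitespace is captured.
list=$(echo "$cars" | grep -Eio '\<[ads][^[:space:]]+')

echo "$list"
```

This should print the same seven names as the "*" version, just without the trailing whitespace on each line.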

Wolfjurgen 04-26-2011 12:29 PM

Oh, cool. Thanks for pointing that out MTK358.


All times are GMT -5.