LinuxQuestions.org
Old 07-29-2011, 04:50 AM   #1
davee
Member
 
Registered: Oct 2002
Location: Ayrshire, Scotland
Distribution: Suse(home) RHEL (Work)
Posts: 263

Rep: Reputation: 30
Using grep or sed to return a regex match


Hi folks,

I'm working on a script that I want to search through a file for a regex match, and store each different match in an array. For instance, if I have a regex to search for an IP address, I would want to store each unique IP address found in the script to the array.

For this reason, I'm trying to find a way that, presumably with grep (or sed?), I can return the match found rather than the full line of text the line was found in.

Is this possible? I could strip the match from the line matched, but this would lead to extra complication with multiple matches on a single line. Am I missing something obvious?

Thanks in advance...
 
Old 07-29-2011, 05:13 AM   #2
devnull10
Member
 
Registered: Jan 2010
Location: Lancashire
Distribution: Slackware Stable
Posts: 547

Rep: Reputation: 115
Check whether your implementation of grep supports the "-o" flag.
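If your grep does have -o (GNU grep and BSD grep both do, as far as I know), a minimal sketch looks like the one below. The function name unique_ips and the loose dotted-quad regex are illustrative assumptions, not something from the post above; the regex does not check that each octet is in the range 0-255.

```shell
# Print each unique dotted-quad match in file $1, one per line.
# Assumes a grep that supports -o (print only the matched text);
# -E selects extended regex syntax so ( ), + and {3} need no backslashes.
unique_ips() {
    grep -oE '([0-9]+\.){3}[0-9]+' "$1" | sort -u
}
```

Because -o prints every match on its own line, several addresses on one input line are handled without extra work. In bash or ksh the result can then be captured with array=( $(unique_ips file) ).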
 
Old 07-29-2011, 08:45 AM   #3
davee
Member
 
Registered: Oct 2002
Location: Ayrshire, Scotland
Distribution: Suse(home) RHEL (Work)
Posts: 263

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by devnull10
Check whether your implementation of grep supports the "-o" flag.
Unfortunately not. I need it to run on Linux and Solaris:

svr:user$ grep -o hello email.txt
grep: illegal option -- o
usage: grep [-[[AB] ]<num>] [-[CEFGVchilnqsvwx]] [-[ef]] <expr> [<files...>]
 
Old 07-29-2011, 08:53 AM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
Using sed:
Code:
array=( $(sed 's/[^0-9]*\([0-9]\+\.[0-9]\+\.[0-9]\+\.[0-9]\+\).*/\1/' file | sort -u) )
Which shell are you using? The array assignment above and the command substitution syntax work in bash/ksh.
 
Old 07-29-2011, 08:56 AM   #5
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
Hi,

try this:
Code:
sed -nr "s/.*(PATTERN_TO_MATCH).*/\1/p" file
I used double quotes so that you can replace 'PATTERN_TO_MATCH' with a variable if you need to. Keep in mind that you will have to escape the dots in an IP address, e.g.
1.2.3.4

must become
1\.2\.3\.4
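When the address comes from a variable, the escaping can be done by a small helper rather than by hand. This is only a sketch: escape_dots and match_literal are hypothetical names, and the second function uses escaped parentheses \( \) instead of -r so that it does not depend on that non-POSIX flag.

```shell
# Escape every dot so it matches a literal "." instead of any character.
escape_dots() {
    printf '%s\n' "$1" | sed 's/\./\\./g'
}

# Print the literal string $1 once for each line of file $2 that contains it.
# The double quotes let the shell expand ${pattern} inside the sed script.
match_literal() {
    pattern=$(escape_dots "$1")
    sed -n "s/.*\(${pattern}\).*/\1/p" "$2"
}
```

For example, match_literal 1.2.3.4 file prints 1.2.3.4 for a line containing that address, but nothing for a line containing 1a2b3c4, which an unescaped pattern would also have matched.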
 
Old 07-29-2011, 09:01 AM   #6
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1957
@crts: good catch on the -n option and the /p flag. As for the -r option, it's not available in Solaris' sed (if I remember correctly), so we have to escape the parentheses to make them work as intended. The only thing I cannot solve with this sed approach is multiple IP addresses on the same line (if any).
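One possible way around the several-addresses-per-line problem is to swap sed for awk, which is not something proposed in this thread. The sketch below assumes an awk that provides the POSIX match() function with RSTART and RLENGTH; as far as I know, on Solaris that means nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk. The name ips_per_line is made up for the example.

```shell
# Print every dotted-quad found in file $1, one per output line,
# including several hits on the same input line.
ips_per_line() {
    awk '{
        rest = $0
        # match() sets RSTART/RLENGTH to the leftmost match in rest
        while (match(rest, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
            print substr(rest, RSTART, RLENGTH)
            # continue searching just after the text that was matched
            rest = substr(rest, RSTART + RLENGTH)
        }
    }' "$1"
}
```

Piping the output through sort -u leaves one line per distinct address, ready for the array assignment.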

Last edited by colucix; 07-29-2011 at 09:02 AM.
 
Old 07-29-2011, 09:57 AM   #7
crts
Senior Member
 
Registered: Jan 2010
Posts: 1,604

Rep: Reputation: 446
Quote:
Originally Posted by colucix
Regarding the -r option, it's not available on Solaris' sed
I did not know that. I also did not consider the possibility of multiple IPs on the same line. Let's say we have this file:
Code:
some junk 1.2.3.4 some more junk with numbers 2.3.4.5 eol
some junk 3.4.5.6 some more junk eol
some junk 4.5.6.7 with equal ip on same line 4.5.6.7 eol
some repetition 1.2.3.4 some more junk with numbers 2.3.4.5 eol
lots of ip 5.6.7.8 in 7.8.9.0 this 8.9.0.11 line 12.34.56.89 eol
(duplicate)lots of ip 5.6.7.8 in 7.8.9.0 this 8.9.0.11 line 12.34.56.89
With GNU sed we can handle it:
Code:
sed -rn 's/[^0-9]*(([0-9]+\.){3}[0-9]+)/\1\n/;T;P;D' file
# or without the -r option
sed -n 's/[^0-9]*\(\([0-9]\+\.\)\{3\}[0-9]\+\)/\1\n/;T;P;D' file
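For anyone puzzling over the T;P;D part, here is the first GNU sed command again, spread over several lines with comments. The wrapper function extract_all_ips is just a hypothetical name for the example, and the script relies on GNU sed (T, -r, and \n in the replacement are GNU features).

```shell
# Print every dotted-quad in file $1, one per line (GNU sed assumed).
extract_all_ips() {
    sed -rn '
        # Replace any leading non-digits plus the first IP with "IP<newline>".
        s/[^0-9]*(([0-9]+\.){3}[0-9]+)/\1\n/
        # T: if no substitution was made, jump to the end of the script.
        T
        # P: print the pattern space up to the first newline, i.e. the IP.
        P
        # D: delete through that newline and rerun the script on what is left.
        D
    ' "$1"
}
```

The D command restarts the script on the remainder of the line without reading new input, which is what makes the loop pick up a second or third address on the same line.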
However, I am not sure about the sed capabilities on Solaris. So here is another solution with all GNU extensions disabled:
Code:
sed --posix -n 's/[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)/\1\n/;t a;b;:a P;D' file
I know, it's ugly. With the --posix option it wouldn't even accept the '+' quantifier.

So we finally get something like:
Code:
array=( $(sed --posix -n 's/[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)/\1\n/;t a;b;:a P;D' file | sort -u) )
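In case a stricter sed rejects the one-liner, a multi-line variant of the same script may travel better. It is only a sketch under two assumptions taken from the POSIX description of sed: a newline in the replacement is written as a backslash followed by a real line break (the meaning of \n there is left unspecified), and the t, b and : commands each end at the end of the line instead of at a semicolon. The function name portable_ips is hypothetical.

```shell
# Same logic as the --posix one-liner: substitute, branch to label a on
# success, otherwise skip to the end; at a, print the IP and loop with D.
portable_ips() {
    sed -n 's/[^0-9]*\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\)/\1\
/
t a
b
:a
P
D' "$1"
}
```

GNU sed accepts this form as well, so the array assignment can stay as array=( $(portable_ips file | sort -u) ) in bash or ksh.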
 
Old 08-02-2011, 03:48 AM   #8
davee
Member
 
Registered: Oct 2002
Location: Ayrshire, Scotland
Distribution: Suse(home) RHEL (Work)
Posts: 263

Original Poster
Rep: Reputation: 30
Quote:
Originally Posted by crts
I know, it's ugly. With the --posix option it wouldn't even accept the '+' quantifier.
Welcome to my world!

Thanks for the response - very comprehensive. Script now working; I appreciate everyone's time on this.
Davee
 
  



Tags
bash, grep, regex


All times are GMT -5.
