LinuxQuestions.org
Old 11-19-2008, 07:52 AM   #1
grob115
Member
 
Registered: Oct 2005
Posts: 540

Rep: Reputation: 32
Regex to exclude a specific phrase


Hello,

I need to create a regular expression that matches a specific substring but ignores another substring. For example, I want to match a sentence containing the word "myMessage" as long as the sentence doesn't contain the word "ignored". Here are two example sentences:
1) Dump: myMessage is to be picked
2) Dump: myMessage is to be ignored

In this case, 1) should be matched and 2) should be ignored. Is this possible with a regular expression?
 
Old 11-19-2008, 08:33 AM   #2
w3bd3vil
Senior Member
 
Registered: Jun 2006
Location: Hyderabad, India
Distribution: Fedora
Posts: 1,191

Rep: Reputation: 49
cat file | grep -v ignored | grep myMessage ?
 
Old 11-19-2008, 09:51 AM   #3
rizwanrafique
Member
 
Registered: Jul 2006
Distribution: Debian, Ubuntu, openSUSE, CentOS
Posts: 147

Rep: Reputation: 19
Regular expressions are tricky when it comes to conditions and iterations. Generally w3bd3vil's solution works unless you're trying to do this as part of a program. In a program (I think) you need to split it into two statements.
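
A minimal sketch of that two-statement idea, written as a POSIX shell function only because the thread's other examples are shell. The function name `wanted` is made up for illustration, and the same two checks would translate to whatever language the real program uses.

```shell
#!/bin/sh
# wanted LINE: succeed only if LINE contains "myMessage" and not "ignored".
# The function name is hypothetical, chosen for this sketch.
wanted() {
    case $1 in
        *ignored*)   return 1 ;;  # statement 1: reject the excluded phrase
        *myMessage*) return 0 ;;  # statement 2: accept the wanted phrase
        *)           return 1 ;;  # neither phrase present
    esac
}

# Feed the two example lines from the first post through the check.
while IFS= read -r line; do
    if wanted "$line"; then
        printf '%s\n' "$line"
    fi
done <<'EOF'
Dump: myMessage is to be picked
Dump: myMessage is to be ignored
EOF
```

Only the "picked" line is printed, since the "ignored" line is rejected by the first branch before the second is tried.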
 
Old 11-19-2008, 10:18 AM   #4
ErV
Senior Member
 
Registered: Mar 2007
Location: Russia
Distribution: Slackware 12.2
Posts: 1,202
Blog Entries: 3

Rep: Reputation: 62

Quote:
Originally Posted by grob115 View Post
Hello,

I need to create a regular expression that matches a specific substring but ignores another substring. For example, I want to match a sentence containing the word "myMessage" as long as the sentence doesn't contain the word "ignored". Here are two example sentences:
1) Dump: myMessage is to be picked
2) Dump: myMessage is to be ignored

In this case, 1) should be matched and 2) should be ignored. Is this possible with a regular expression?
You could do it the ugly way:
Code:
cat test.txt |egrep 'myMessage.*(([^i][^g][^n][^o][^r][^e][^d])|picked)$'
But please note that this regexp assumes "picked" is at the end of the line, and it is not an elegant solution. Also note that "[^i][^g][^n][^o][^r][^e][^d]" can't be used alone (without the "|picked)" part) to filter out "ignored", because it will also remove words like "index", etc., i.e. those that have at least one letter in the same position as "ignored".

The less ugly way would be:
Code:
cat test.txt |egrep 'myMessage.*(([^ignored]{7})|picked)$'
But it will remove any 7-letter word that contains any of the letters in "ignored", so it doesn't precisely remove "ignored".

It looks like grep doesn't have an elegant NOT operation when it comes to words. You can exclude letters from a certain character set, and you can search for one word OR another (egrep 'myMessage.*(picked|ignored)' will match lines with picked or ignored, but nothing else), but there is no operator for excluding a word or pattern.
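
To make the character-set caveat concrete, here is a small sketch that runs the second pattern with `grep -E` (the current spelling of `egrep`). The third sample line, ending in "queued", is an invented example and not from the thread.

```shell
#!/bin/sh
# The character-set pattern from this post, unchanged.
pattern='myMessage.*(([^ignored]{7})|picked)$'

# Three sample lines; only the first two come from the thread.
sample='Dump: myMessage is to be picked
Dump: myMessage is to be ignored
Dump: myMessage is to be queued'

# Collect the lines the pattern lets through and show them.
kept=$(printf '%s\n' "$sample" | grep -E "$pattern")
printf '%s\n' "$kept"
```

Only the "picked" line survives. The "queued" line is dropped even though it never mentions "ignored", because its last seven characters include "e" and "d", which are in the excluded set.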

Last edited by ErV; 11-19-2008 at 10:36 AM.
 
Old 11-19-2008, 12:15 PM   #5
grob115
Member
 
Registered: Oct 2005
Posts: 540

Original Poster
Rep: Reputation: 32
Hello,
Thanks w3bd3vil. Grep actually wouldn't work because this regex is supposed to go into another program and is not meant to be run on the command line.

As for the following, I actually do need a few words (i.e. a substring) to be matched exactly as they are. So I guess the caret operator on individual characters, or on the character set, won't work.
Quote:
Originally Posted by ErV
Code:
cat test.txt |egrep 'myMessage.*(([^i][^g][^n][^o][^r][^e][^d])|picked)$'
Code:
cat test.txt |egrep 'myMessage.*(([^ignored]{7})|picked)$'
I guess I should have been a bit clearer:
1) Dump: myMessage is ....
2) Dump: myMessage is to be not picked as it's not understood

where "...." can be anything but "to be not picked as it's not understood". Something along those lines. Is there any way this can be done using regex? Thanks.
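
One possible answer, assuming the target program's regex engine supports Perl-style negative lookahead `(?!...)`, which the thread does not confirm: a single pattern can require "myMessage is " and refuse the excluded phrase right after it. The sketch below tries it with `grep -P`, which needs GNU grep built with PCRE support; plain `grep -E` does not understand lookahead.

```shell
#!/bin/sh
# Negative lookahead: match "myMessage is " only when the excluded phrase
# does not follow immediately. Assumes a PCRE-style engine (here GNU grep -P).
pattern="myMessage is (?!to be not picked as it's not understood)"

# Variant for the first post: the excluded word may sit anywhere in the line.
anywhere='^(?!.*ignored).*myMessage'

sample="Dump: myMessage is to be picked
Dump: myMessage is to be not picked as it's not understood"

# Lines accepted by the lookahead pattern.
matched=$(printf '%s\n' "$sample" | grep -P "$pattern")
printf '%s\n' "$matched"

# The "anywhere" variant against the two lines from the first post.
first=$(printf '%s\n' 'Dump: myMessage is to be picked' \
                      'Dump: myMessage is to be ignored' | grep -P "$anywhere")
printf '%s\n' "$first"
```

Both commands print only the "picked" line. If the target engine has no lookahead, the two-step check suggested earlier in the thread is the fallback.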
 
Old 11-19-2008, 01:10 PM   #6
Disillusionist
Senior Member
 
Registered: Aug 2004
Location: England
Distribution: Ubuntu
Posts: 1,039

Rep: Reputation: 97
Code:
awk '/Dump:/ && !/not understood/' test.txt
 
  

