LinuxQuestions.org

lcvs 06-25-2012 10:32 PM

awk: searching for a pattern and removing everything before it
 
Hi all!

I am trying to use awk to find the lines whose 2nd field matches one of 2 patterns, and to return only those lines, keeping the pattern plus everything after it in that field (i.e. removing everything before the pattern in the field).

input (pattern 1 is ZXCV, pattern 2 is RTYU):
Code:

AAA|bbbbbZXCVjhkjhkjhk|DDDDDDDD
AAAAAA|jtyfytthyhyRTYUewertyu|OOOOOOOOO

output:
Code:

AAA|ZXCVjhkjhkjhk|DDDDDDDD
AAAAAA|RTYUewertyu|OOOOOOOOO


I was thinking of using 2 gensub commands, one for each pattern:
Code:

gensub(/(.*)(pattern1)(.*)/,"\\2\\3","g",$2)
gensub(/(.*)(pattern2)(.*)/,"\\2\\3","g",$2)

But is there a way to write it with only one command?

Thanks for your help!

Jebram 06-26-2012 02:15 AM

Try regexp group alternatives:
Code:

gensub(/(.*)(pattern1|pattern2)(.*)/,"\\2\\3","g",$2)
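One detail worth keeping in mind with this approach: gensub() is a GNU awk (gawk) extension, and it returns the modified string rather than changing its target, so the result has to be assigned back to $2. A minimal sketch, assuming gawk is installed and taking ZXCV and RTYU from the sample input as pattern1 and pattern2; the function name keep_from_pattern is only a hypothetical wrapper:

```shell
# Hypothetical wrapper; needs GNU awk, since gensub() is a gawk extension.
keep_from_pattern() {
    gawk 'BEGIN { FS = OFS = "|" }
    {
        # gensub() returns the new string, so assign it back to field 2.
        $2 = gensub(/(.*)(ZXCV|RTYU)(.*)/, "\\2\\3", "g", $2)
        print
    }' "$@"
}

# Usage (reads the named files, or stdin when none are given):
#   keep_from_pattern input
```

Because the leading .* is greedy, this should cut at the last occurrence of either pattern in the field; lines without a match are printed unchanged.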

lcvs 06-26-2012 09:16 PM

Thanks Jebram!

I had tried with "||" instead of "|" before; it works now!

lcvs 06-27-2012 01:38 AM

Actually, it doesn't work...

It only handles pattern2 and doesn't change anything when the field contains pattern1!

output:
Code:

AAA|bbbbbZXCVjhkjhkjhk|DDDDDDDD
AAAAAA|RTYUewertyu|OOOOOOOOO


And when I write the patterns between parentheses:
Code:

awk 'BEGIN{FS=OFS="|"} {gensub(/^(.*)((pattern1)|(pattern2))(.*)$/,"\\2\\3","g",$2)}1' input

It doesn't change anything when the field matches pattern1, and only keeps pattern2 itself when it matches pattern2:
Code:

AAA|bbbbbZXCVjhkjhkjhk|DDDDDDDD
AAAAAA|RTYU|OOOOOOOOO

I don't know if it is a syntax problem or if this command can't handle several patterns at once!
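A possible explanation for the RTYU-only result: capture groups are numbered by the position of their opening parenthesis, so the extra parentheses shift the numbering. In /^(.*)((pattern1)|(pattern2))(.*)$/ the tail is group 5, not group 3, and "\\2\\3" expands to the matched pattern followed by whatever group 3 (pattern1 alone) captured. A small sketch, assuming gawk and the sample patterns ZXCV and RTYU; show_groups is a hypothetical helper that just prints each group:

```shell
# Hypothetical helper; needs GNU awk for gensub().
show_groups() {
    gawk '{
        # Print what each numbered group captured for the whole input line.
        print gensub(/^(.*)((ZXCV)|(RTYU))(.*)$/, "1=[\\1] 2=[\\2] 3=[\\3] 4=[\\4] 5=[\\5]", 1)
    }' "$@"
}

# Usage:
#   echo 'jtyfytthyhyRTYUewertyu' | show_groups
```

With this grouping, the replacement that keeps the pattern and the tail would be "\\2\\5". This does not by itself account for the pattern1 line coming back unchanged.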

lcvs 06-27-2012 02:20 AM

I also tried a single regex covering the 2 patterns:

Code:

awk 'BEGIN{FS=OFS="|"} {gensub(/^(.*)([ZR][XT][CY][VU])(.*)$/,"\\2\\3","g",$2)}1' input

Surprisingly, it works only when the field matches pattern2 (and returns the entire string when it matches pattern1).

Help would be welcome... :-)

grail 06-27-2012 03:16 AM

How about:
Code:

awk 'BEGIN{OFS=FS="|"}match($2,/(ZXCV|RTYU)/,f){split($2,a,f[1]);sub(a[1],"",$2);$1=$1;print}' file
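For comparison, here is a variant that should also run on awks without gawk's three-argument match(). It is a sketch, not what was posted above: match() sets RSTART to the position where the pattern starts, so substr($2, RSTART) keeps the pattern and everything after it. The function name keep_from_match is hypothetical.

```shell
# Hypothetical wrapper around a POSIX-style awk program.
keep_from_match() {
    awk 'BEGIN { FS = OFS = "|" }
    # match() returns 0 when field 2 has neither pattern, so only matching lines print.
    match($2, /ZXCV|RTYU/) {
        $2 = substr($2, RSTART)   # drop everything before the match
        print
    }' "$@"
}

printf '%s\n' 'AAA|bbbbbZXCVjhkjhkjhk|DDDDDDDD' 'AAAAAA|jtyfytthyhyRTYUewertyu|OOOOOOOOO' | keep_from_match
```

This cuts at the first occurrence of either pattern, and it avoids passing the removed prefix back through sub(), where it would be interpreted as a regex.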


All times are GMT -5.