Multiple patterns in one line

fs11 · 09-24-2010, 12:05 PM

Hello All,

I am trying to extract lines from a text file that contain the same pattern more than once. The format of the file is below:

R.GPLPAAPPAAPERQPS*WER.S
R.GQS*PVSRET*APPVPAARAR.T
R.GQS*PVS*RETAPPVPAARAR.T
R.GQSPVS*RET*APPVPAARAR.T
K.GIPFAT*AKT*LENPQR.H
K.GLHVRAAS*VS*AKGM#SR.K

The output should be the lines that have more than one *.
The problem is that * is also a special (wildcard) character for sed, grep, etc., and I am not able to get the correct expression. Any help would be appreciated.

Thanks in advance.

Guttorm · 09-24-2010, 12:17 PM

Put a \ before * and other special characters when you use sed, grep, etc. They are then treated as literal characters.
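
As a rough sketch of that advice (assuming a POSIX-style grep; the scratch file and the NOSTARHERE line are made up for illustration, the other sample lines come from the first post):

Code:

```shell
# Scratch file holding two sample lines from the first post plus one line without *.
sample=$(mktemp)
printf '%s\n' 'R.GPLPAAPPAAPERQPS*WER.S' 'K.GIPFAT*AKT*LENPQR.H' 'NOSTARHERE' > "$sample"

# '\*' matches a literal asterisk; -c counts the matching lines.
escaped=$(grep -c '\*' "$sample")

# -F treats the pattern as a fixed string, so no backslash is needed.
fixed=$(grep -cF '*' "$sample")

echo "escaped=$escaped fixed=$fixed"
rm -f "$sample"
```

Both counts should come out the same, since each form only looks for a literal *.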

fs11 · 09-24-2010, 12:28 PM

I am trying to use this:

Quote:

sed '/\*/!d; /\*/!d filename

but it doesn't work.

GrapefruiTgirl · 09-24-2010, 12:39 PM

I think your regex is not detailed enough, nor the sed command complete enough, in post #3, to do what you expect it to. Meanwhile, here's an example using grep:

Code:

sasha@reactor: cat asterisk.sh

R.GPLPAAPPAAPERQPS*WER.S
R.GQS*PVSRET*APPVPAARAR.T
R.GQS*PVS*RETAPPVPAARAR.T
R.GQSPVS*RET*APPVPAARAR.T
K.GIPFAT*AKT*LENPQR.H
K.GLHVRAAS*VS*AKGM#SR.K

sasha@reactor: grep -e '\*.*\*' asterisk.sh
R.GQS*PVSRET*APPVPAARAR.T
R.GQS*PVS*RETAPPVPAARAR.T
R.GQSPVS*RET*APPVPAARAR.T
K.GIPFAT*AKT*LENPQR.H
K.GLHVRAAS*VS*AKGM#SR.K
sasha@reactor:

Breaking down the regex: '\*.*\*'

The stuff inside the '' means to match: a literal * character, followed by any number of characters (including none), followed by another literal * character. So broken into its parts, the regex means:
\* = a single * character (the * is escaped)
.* = zero or more of any character
\* = and again, another * character (again, escaped with the backslash).
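
A quick way to check that breakdown, as a sketch with three made-up test strings (they are not from the original file):

Code:

```shell
# 'A*B' has one asterisk, 'A**B' has two adjacent ones, 'A*B*C' has two separated ones.
# '.*' may match nothing, so 'A**B' counts as a match as well.
two_plus=$(printf '%s\n' 'A*B' 'A**B' 'A*B*C' | grep -c '\*.*\*')

echo "lines with at least two asterisks: $two_plus"
```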

If you want the output to go to a new file, redirect it with a > character into a file.
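
For completeness, here is a sketch of how the sed attempt from post #3 could be written with the same regex, with the result redirected to a file. The scratch files are hypothetical stand-ins for your real input and output names:

Code:

```shell
sample=$(mktemp)
out=$(mktemp)
printf '%s\n' 'R.GPLPAAPPAAPERQPS*WER.S' 'R.GQS*PVSRET*APPVPAARAR.T' 'K.GIPFAT*AKT*LENPQR.H' > "$sample"

# -n turns off automatic printing; p prints only the lines the address matches.
sed -n '/\*.*\*/p' "$sample" > "$out"

# Same filter written in the delete form: drop lines that do not match.
deleted_form=$(sed '/\*.*\*/!d' "$sample")

# grep -c '' counts every line in the output file.
saved=$(grep -c '' "$out")
echo "saved=$saved"
[ "$deleted_form" = "$(cat "$out")" ] && echo "both forms agree"
rm -f "$sample" "$out"
```

The original attempt tested for a single * twice, which is why it also kept lines with only one asterisk.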

fs11 · 09-24-2010, 01:28 PM

Thanks a lot.

kurumi · 09-24-2010, 07:24 PM

Code:

$ ruby -ne 'print if $_.count("*")>1' file
R.GQS*PVSRET*APPVPAARAR.T
R.GQS*PVS*RETAPPVPAARAR.T
R.GQSPVS*RET*APPVPAARAR.T
K.GIPFAT*AKT*LENPQR.H
K.GLHVRAAS*VS*AKGM#SR.K

GrapefruiTgirl · 09-24-2010, 07:41 PM

@ kurumi - I don't see too much in the way of ruby solutions normally, until you came along and began posting some.

I like it - the more ways we have to do stuff, the better! Thanks.

grail · 09-24-2010, 11:48 PM

Just as another alternative:

Code:

awk 'split($0,a,"*") > 2' file

kurumi · 09-25-2010, 12:03 AM

Quote:

Originally Posted by grail

Just as another alternative:

Code:

awk 'split($0,a,"*") > 2' file

Why not let awk do the splitting?

Code:

awk -F"*" 'NF>2' file
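
To see why NF>2 works, here is a small sketch that prints the field count for a few of the sample lines (the scratch file is hypothetical):

Code:

```shell
sample=$(mktemp)
printf '%s\n' 'R.GPLPAAPPAAPERQPS*WER.S' 'K.GIPFAT*AKT*LENPQR.H' 'K.GLHVRAAS*VS*AKGM#SR.K' > "$sample"

# With "*" as the field separator, a line with n asterisks has n+1 fields.
awk -F'*' '{ print NF ": " $0 }' "$sample"

# NF > 2 therefore keeps only the lines with two or more asterisks.
kept=$(awk -F'*' 'NF > 2' "$sample" | grep -c '')
echo "kept=$kept"
rm -f "$sample"
```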

grail · 09-25-2010, 12:35 AM

Touché