LinuxQuestions.org
LinuxQuestions.org > Forums > Linux Forums > Linux - Newbie
Old 04-11-2011, 09:44 AM   #1
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Rep: Reputation: 1
Grep varying no. of lines between two patterns


I have a file that goes like this
>pattern 1
xyz
xyz
abc
asdfg
>pattern 2
xyz
>pattern 1
adbf
sfni
>pattern 2
bla bla
xyz

I need to grep the lines between pattern 1 and pattern 2, but not the lines following pattern 2. I cannot use grep -A NUM, as the number of lines following pattern 1 varies. I also tried some awk one-liners, but the results were wrong.

I'd be glad if someone could come up with a good one-liner for this.
 
Old 04-11-2011, 10:03 AM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,825
Blog Entries: 1

Rep: Reputation: 1221
Hi and welcome to LQ.

Try using sed to accomplish the task. If you get stuck at any point, feel free to post your code. We'll be happy to assist you.
 
Old 04-11-2011, 10:46 AM   #3
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443
Blog Entries: 3

Rep: Reputation: 721
AWK code:

Code:
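BEGIN {
    inside = 0    # 1 while between ">pattern 1" and ">pattern 2"
}

# Start marker: switch the flag on
/>pattern 1/ {
    inside = 1
}

# End marker: switch the flag off
/>pattern 2/ {
    inside = 0
}

# Replace "your pattern" with the text you are looking for
/your pattern/ && inside {
    print    # or do other stuff with the line
}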
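
As a rough sketch of how the flag idea above could be applied to the sample from post #1: the version below drops the "your pattern" filter, skips the marker lines themselves, and prints everything in between. The file name sample.txt and the function name between_flag are made up for this example.

```shell
# Recreate the sample input from post #1 (sample.txt is a hypothetical file name).
cat > sample.txt <<'EOF'
>pattern 1
xyz
xyz
abc
asdfg
>pattern 2
xyz
>pattern 1
adbf
sfni
>pattern 2
bla bla
xyz
EOF

# between_flag FILE: print the lines found between the markers, markers excluded.
between_flag() {
    awk '
        />pattern 1/ { inside = 1; next }   # start marker: flag on, skip the marker line
        />pattern 2/ { inside = 0; next }   # end marker: flag off, skip the marker line
        inside                              # no action given, so awk prints the line
    ' "$1"
}

between_flag sample.txt
```

Because each marker rule ends with next, the marker lines never reach the final rule, so only the lines between them are printed.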
 
Old 04-11-2011, 11:07 AM   #4
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,825
Blog Entries: 1

Rep: Reputation: 1221
OK, since we have started giving solutions, the sed one (if I understand the problem correctly) would be as follows:

Code:
sed -n '/>pattern 1/,/>pattern 2/p' infile
 
1 members found this post helpful.
Old 04-11-2011, 11:26 AM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
Assuming the header and footer lines are not wanted either:
Code:
awk '/>pattern 1/,/>pattern 2/{if(!/pattern/)print}' file
Or maybe (this needs an awk that accepts a regular expression as RS, such as gawk):
Code:
awk '!(NR % 2)' RS=">pattern [12]\n" ORS="" file
 
Old 04-12-2011, 02:18 AM   #6
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Original Poster
Rep: Reputation: 1
@sycamorex
Thanks.
I used a combination of sed and grep, as I do not need the line containing pattern 2.

Last edited by Tauro; 04-12-2011 at 02:31 AM.
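
The exact sed and grep combination is not shown in the post, so the following is only a guess at what it could look like, together with a sed-only variant that gives the same result. The names sample.txt, with_grep and sed_only are invented for this sketch.

```shell
# Recreate the sample input from post #1 (sample.txt is a hypothetical file name).
cat > sample.txt <<'EOF'
>pattern 1
xyz
xyz
abc
asdfg
>pattern 2
xyz
>pattern 1
adbf
sfni
>pattern 2
bla bla
xyz
EOF

# Variant 1: sed prints each block with its marker lines, grep -v removes the markers.
with_grep() {
    sed -n '/>pattern 1/,/>pattern 2/p' "$1" | grep -v '>pattern [12]'
}

# Variant 2: sed alone. Inside the range, print only the lines that are not markers.
sed_only() {
    sed -n '/>pattern 1/,/>pattern 2/{/>pattern [12]/!p;}' "$1"
}

with_grep sample.txt
sed_only sample.txt
```

Both variants should print the same six lines for this sample; the sed-only form just saves one process.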
 
Old 04-12-2011, 03:39 AM   #7
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
Don't forget to mark as SOLVED once you have a solution.
 
Old 04-12-2011, 04:14 AM   #8
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Original Poster
Rep: Reputation: 1
Also, in the same file I have certain patterns that go on this way.

>pattern1
xyz
zz
sss
dd
>pattern2
ggg
ddd
aa
>pattern1
cwefw
swd
>pattern1 pattern2
ggg
ss
aaa
s
>pattern2

In this case, the sed one-liner sed -n '/pattern1/,/pattern2/p' file won't pick up the lines following ">pattern1 pattern2".
 
Old 04-12-2011, 08:01 AM   #9
mayursingru
Member
 
Registered: Nov 2010
Location: Pune
Distribution: CentOS
Posts: 51

Rep: Reputation: 5
Hi Tauro,
Try this out
Code:
 sed -n '/pattern1/,/pattern2/p;/pattern1 pattern2/,/pattern2/p' file


Regards,
Mayur Singru
 
Old 04-12-2011, 08:15 AM   #10
kurumi
Member
 
Registered: Apr 2010
Posts: 228

Rep: Reputation: 46
using Ruby

Code:
$ ruby -0777 -ne 'puts $_.scan(/pattern 1(.*?)pattern 2/m)' file

xyz
xyz
abc
asdfg
>

adbf
sfni
>
 
Old 04-12-2011, 08:57 AM   #11
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,825
Blog Entries: 1

Rep: Reputation: 1221
Quote:
Originally Posted by Tauro View Post
Also, in the same file I have certain patterns that go on this way.

>pattern1
xyz
zz
sss
dd
>pattern2
ggg
ddd
aa
>pattern1
cwefw
swd
>pattern1 pattern2
ggg
ss
aaa
s
>pattern2

In this case, the sed one-liner sed -n '/pattern1/,/pattern2/p' file won't pick up the lines following ">pattern1 pattern2".

What about:

Code:
sed -n '/pattern1/,/>pattern2/p' file
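
Why this can help: a sed range closes at the first later line that matches the second address. With plain pattern2 as the end address, the header ">pattern1 pattern2" itself closes the range, so the lines after it are never printed. With >pattern2 that header no longer matches, and the range runs on to the real end marker. A small sketch that compares the two on the sample from post #8 (sample2.txt, loose.out and strict.out are made-up file names):

```shell
# Recreate the sample input from post #8 (sample2.txt is a hypothetical file name).
cat > sample2.txt <<'EOF'
>pattern1
xyz
zz
sss
dd
>pattern2
ggg
ddd
aa
>pattern1
cwefw
swd
>pattern1 pattern2
ggg
ss
aaa
s
>pattern2
EOF

# Original attempt: any line containing "pattern2" ends the range,
# so the second block stops at the ">pattern1 pattern2" header.
sed -n '/pattern1/,/pattern2/p' sample2.txt > loose.out

# Suggestion from this post: only a line containing ">pattern2" ends the range.
sed -n '/pattern1/,/>pattern2/p' sample2.txt > strict.out

echo "--- end address pattern2 ---";  cat loose.out
echo "--- end address >pattern2 ---"; cat strict.out
```

Note that the ">pattern1 pattern2" header is still printed in both cases; whether it should be is exactly the open question about the expected output.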
 
Old 04-12-2011, 09:07 AM   #12
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,550

Rep: Reputation: 2898
Quote:
In this case, the sed one-liner sed -n '/pattern1/,/pattern2/p' file won't pick up the lines following ">pattern1 pattern2".
There are plenty of patterns that will not fit your original query. Also, you would have to explain again what you want the output to be, i.e. should it display the single space between '>pattern1 pattern2', or should it now display everything until '>pattern2' is found at the start of a line?
 
Old 04-12-2011, 09:11 AM   #13
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,825
Blog Entries: 1

Rep: Reputation: 1221
As grail pointed out, it would be helpful if you could give us more specific information (ideally by posting the exact input file and what the output should look like).
 
Old 04-12-2011, 01:15 PM   #14
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Original Poster
Rep: Reputation: 1
@grail and sycamorex
Alright, here is what I specifically want. Below is 0.1% of my data.

>Q53HC2_HUMAN/218-253 PF10417.3;1-cysPrx_C;
ALQYVETHGEVCPANWTPDSPTIKPSPAASKEYFQK

>A4JFS8_BURVG/507-580 PF12796.1;Ank_2;
ACDAGDHYPLHLLVWKNDYRQLEKELQGQNVEAVDPRGRTLLHLAVSLGH
LESARVLLRHKADVTKENRQGWTVLHEAVSTGDPEMVYTVLQHRDYHNTS

>B4DZA5_HUMAN/287-857 PF04547.6;Anoctamin;
IRKYYGEKIGIYFAWLGYYTQMLLLAAVVGVACFLYGYLNQDNCTWSKEV
CHPDIGGKIIMCPQCDRLCPFWKLNITCESSKKLCIFDSFGTLVFAVFMG
VWVTLFLEFWKRRQAELEYEWDTVELQQEEQARPEYEARCTHVVIDEITQ
EEERIPFTAWGKCIRITLCASAVFFWILLIIASVIGIIVYRLSVFIVFSA

>ANFC_HUMAN/94-126 PF00212.12;ANP;
NARKYKGANKKGLSKGCFGLKLDRIGSMSGLGC

I need each line containing HUMAN plus the lines that follow it, up to the next line starting with ">".
For the third record here, the sed one-liner picks up everything from '>B4DZA5_HUMAN...' to '>ANFC_HUMAN', but not the line following ANFC_HUMAN.
I hope that makes it clear now.

Thanks in advance for helping.
 
Old 04-12-2011, 02:59 PM   #15
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,825
Blog Entries: 1

Rep: Reputation: 1221
Is there any common pattern in the pattern 2 lines (A4JFS8_BURVG/507-580 PF12796.1;Ank_2)?
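
For the data in post #14, a sed range such as /HUMAN/,/>/ runs into the problem described there: the next ">" header closes the range, and if that header is itself a HUMAN record, the lines after it are no longer inside any range. One possible way around this is to decide on every header line whether the record is wanted. The sketch below assumes that every record starts with a line beginning with ">" and that HUMAN appears in the header of each wanted record; proteins.txt and human_records are made-up names.

```shell
# Recreate the data excerpt from post #14 (proteins.txt is a hypothetical file name).
cat > proteins.txt <<'EOF'
>Q53HC2_HUMAN/218-253 PF10417.3;1-cysPrx_C;
ALQYVETHGEVCPANWTPDSPTIKPSPAASKEYFQK

>A4JFS8_BURVG/507-580 PF12796.1;Ank_2;
ACDAGDHYPLHLLVWKNDYRQLEKELQGQNVEAVDPRGRTLLHLAVSLGH
LESARVLLRHKADVTKENRQGWTVLHEAVSTGDPEMVYTVLQHRDYHNTS

>B4DZA5_HUMAN/287-857 PF04547.6;Anoctamin;
IRKYYGEKIGIYFAWLGYYTQMLLLAAVVGVACFLYGYLNQDNCTWSKEV
CHPDIGGKIIMCPQCDRLCPFWKLNITCESSKKLCIFDSFGTLVFAVFMG
VWVTLFLEFWKRRQAELEYEWDTVELQQEEQARPEYEARCTHVVIDEITQ
EEERIPFTAWGKCIRITLCASAVFFWILLIIASVIGIIVYRLSVFIVFSA

>ANFC_HUMAN/94-126 PF00212.12;ANP;
NARKYKGANKKGLSKGCFGLKLDRIGSMSGLGC
EOF

# human_records FILE: print every record whose header line contains HUMAN.
human_records() {
    awk '
        /^>/ { keep = ($0 ~ /HUMAN/) }   # header line: is this record wanted?
        keep                             # print header and body lines of wanted records
    ' "$1"
}

human_records proteins.txt
```

Blank lines that belong to a wanted record are printed too; appending | grep -v '^$' would remove them if they are not wanted.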
 
All times are GMT -5.