LinuxQuestions.org
Old 07-21-2011, 06:49 AM   #1
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Rep: Reputation: 1
Awk to extract patterns till it hits blank line (in for loop)


I have a list of patterns in file1
Code:
10047134
10047140
100816392
100913026
100913028
100913192
...
..
file2
Code:
>gi|10047134|ref|
MMWQCHLSAQDYRYYPVDGYSLLKRFPLHPLTGPRCPVQTVGQWLESIGLPQYENHLMANGFDNVQFMGSNVMEDQDLLE
HRKRILASLGLRPPNEATASTPVQYWQHHPEKLIFQSCDYKAFYLGSMLIKELRGTESTQDACAKMRANCQKSTEQMKKVPTIILSVSYKGVKFIDATNKNIIAEHEIRNISCAAQDPEDLSTFAYITKDLK
SNHHYCHVF

>gi|10047140|ref
MESEMETQSARAEEGFTQVTRKGGRRAKKRQAEQLSAAGEGGDAGRMDTEEARPAKRPVFPPLCGDGLLSGKEETRKIPV
PANRYTPLKENWMKIFTPIVEHLGLQIRFNLKSRNVEIRTCKETKDVSALTKAADFVKAFILGFQVEDALALIRLDDLFL
ESFEITDVKPLKGDHL

>gi|100913028|ref|
MEVAEKLQLLNHRPVTAVEIQLMVEESEERLTEEQIEALLHTVTSILPAEPEAEQKKNTNSNVAMDEEDPA
What I want to do is, for each pattern in file1, pull out the line in file2 containing the pattern and the lines below it, until a blank line is reached.

The awk one-liner I used works given a single pattern:
Code:
 awk 'BEGIN{RS=ORS="\n\n"; FS="\n"}/pattern/' infile
But..
Code:
for i in `cat file1`
> do
> awk 'BEGIN{RS=ORS="\n\n"; FS="\n"}/$i/' file2 >>Outfile
> done
...gives a blank Outfile.
Can you suggest a better way?
 
Old 07-21-2011, 10:34 AM   #2
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,243

Rep: Reputation: 2684
Well, since it is a number of varying length, I presume you are defining the pattern with pipes as delimiters?

As for reading 2 files, pass file1 to awk inside your BEGIN block and assign the individual numbers to an array (normally you could do this in the main part of the script instead, but your change to RS would read all of the first file as one record), then check the number in each record of the second file against the array.
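A minimal sketch of the BEGIN approach described above, assuming one ID per line in file1 and headers of the form `>gi|ID|ref|` in file2. The sample data, the temporary directory, and the names `ids`, `wanted` and `part` are made up for illustration:

```shell
# Made-up sample data mirroring the layout shown in the first post.
tmp=$(mktemp -d)
printf '%s\n' 10047134 100913028 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
>gi|10047134|ref|
MMWQCHLSAQ
SNHHYCHVF

>gi|10047140|ref
MESEMETQSA

>gi|100913028|ref|
MEVAEKLQLL
EOF

# BEGIN reads the ID list with getline while RS is still a newline, so each
# ID becomes one array key. RS is switched to paragraph mode only afterwards,
# so the main rule sees file2 as blank-line separated records.
result=$(awk -v ids="$tmp/file1" '
BEGIN {
    while ((getline line < ids) > 0) wanted[line] = 1
    close(ids)
    RS = ""; ORS = "\n\n"
}
{
    split($0, part, "|")           # part[2] is the ID in ">gi|ID|ref|"
    if (part[2] in wanted) print   # prints the whole block
}' "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

awk runs only once here, no matter how many IDs file1 contains.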

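The alternative is to continue the way you started and simply use the -v option to assign $i to an awk variable (inside single quotes the shell does not expand $i, so awk never sees its value). Of course, the hit here is that awk is executed once per pattern.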
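A sketch of the -v variant, under the same assumptions about the file layout. The sample data, the temporary directory and the awk variable name `id` are illustrative only, and `RS = ""` (paragraph mode) is used here in place of the original `RS = "\n\n"`:

```shell
# Made-up sample data mirroring the layout shown in the first post.
tmp=$(mktemp -d)
printf '%s\n' 10047134 100913028 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
>gi|10047134|ref|
MMWQCHLSAQ
SNHHYCHVF

>gi|10047140|ref
MESEMETQSA

>gi|100913028|ref|
MEVAEKLQLL
EOF

: > "$tmp/Outfile"
while IFS= read -r id; do
    # -v copies the shell value into the awk variable "id". In paragraph
    # mode a newline also separates fields, so $2 is the ID of the header.
    awk -v id="$id" 'BEGIN { RS = ""; ORS = "\n\n"; FS = "|" } $2 == id' \
        "$tmp/file2" >> "$tmp/Outfile"
done < "$tmp/file1"

result=$(cat "$tmp/Outfile")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Comparing `$2` with `==` also avoids the substring matches a regex such as `/10047134/` could produce on longer IDs.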
 
Old 07-21-2011, 10:55 AM   #3
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Original Poster
Rep: Reputation: 1
Putting the patterns in an array and matching records gives me only the line containing the pattern.
How do I go about printing the lines below it? :|
 
Old 07-21-2011, 11:23 AM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1976
Quote:
Originally Posted by Tauro
How do I go about printing the lines below it? :|
Try getline inside a while loop, e.g.
Code:
awk -F"|" 'FNR == NR { pattern[$1]++; next } FNR < NR { if ( $2 in pattern ) { while ( $0 !~ /^$/ ) { print; if ( (getline) <= 0 ) break } print "" } }' file1 file2
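For readers who prefer it spelled out, here is a commented multi-line sketch of the same getline technique. It checks getline's return value because getline leaves `$0` unchanged at end of file, so a final block without a trailing blank line would otherwise never end the loop. The sample data, the temporary directory and the array name `wanted` are assumptions for illustration:

```shell
# Made-up sample data; note the last block has no trailing blank line.
tmp=$(mktemp -d)
printf '%s\n' 10047134 100913028 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
>gi|10047134|ref|
MMWQCHLSAQ
SNHHYCHVF

>gi|10047140|ref
MESEMETQSA

>gi|100913028|ref|
MEVAEKLQLL
EOF

result=$(awk -F'|' '
# First file: one ID per line, stored as an array key (blank lines skipped).
FNR == NR { if ($1 != "") wanted[$1] = 1; next }

# Second file: a header whose second "|" field is a wanted ID.
($2 in wanted) {
    do {
        print                      # header first, then each sequence line
    } while ((getline) > 0 && $0 != "")
    print ""                       # blank line between extracted blocks
}' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```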
 
Old 07-21-2011, 01:46 PM   #5
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 9,243

Rep: Reputation: 2684
Another alternative:
Code:
awk 'BEGIN{RS = ""; ORS = "\n\n"; FS = "\n"}FNR == NR{while(++i <= NF)a[$i]++;FS="|";next}$2 in a' file1 file2
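The one-liner relies on awk's paragraph mode: with `RS = ""`, blank lines separate records, so file1 (if it has no blank lines) arrives as a single record whose fields are the IDs, and each block of file2 is one record. A commented sketch of the same idea follows. The sample data, the temporary directory and the array name `wanted` are illustrative, and a `for` loop replaces the `while(++i <= NF)` counter so that it restarts on every record:

```shell
# Made-up sample data mirroring the layout shown in the first post.
tmp=$(mktemp -d)
printf '%s\n' 10047134 100913028 > "$tmp/file1"
cat > "$tmp/file2" <<'EOF'
>gi|10047134|ref|
MMWQCHLSAQ
SNHHYCHVF

>gi|10047140|ref
MESEMETQSA

>gi|100913028|ref|
MEVAEKLQLL
EOF

result=$(awk '
BEGIN { RS = ""; ORS = "\n\n"; FS = "\n" }   # paragraph mode, one ID per field
FNR == NR {
    for (i = 1; i <= NF; i++) wanted[$i] = 1
    FS = "|"                                 # applies from the next record on
    next
}
$2 in wanted                                 # no action: print the whole record
' "$tmp/file1" "$tmp/file2")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Because a matching record is the entire block, no explicit loop over the following lines is needed.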
 
Old 07-22-2011, 12:20 AM   #6
Tauro
LQ Newbie
 
Registered: Apr 2011
Posts: 24

Original Poster
Rep: Reputation: 1
Got it!
Thanks a ton, grail and colucix.
 
  



All times are GMT -5.
