LinuxQuestions.org
Old 07-28-2010, 02:50 PM   #1
chargaff
LQ Newbie
 
Registered: Jul 2010
Posts: 7

Rep: Reputation: 0
storing grep output to feed awk to retrieve entire records matching variable


Hi everybody,

I'm quite new to scripting, and this is my first post on this forum!

I have two files:
FileA
prot1
prot5
prot9
prot15
...

FileB
###
prot1
xxxcxcxcvvv
###
prot2
xxxxccxcxcxc
###
prot3
xxdxdxgggbb
###
...

What I need to do is extract from fileB only the records whose name (the line after ###) appears in fileA.

I thought awk could do the job easily with :
Code:
awk 'BEGIN { RS = "###" } /'$variable'/' fileB > output
where variable would be, say, the output of grep on fileA. Can I store the output of grep in a variable and use it afterwards with awk?

Something like this:
Code:
result=`grep prot. fileA` ; awk 'BEGIN { RS = "###" } /'$result'/' fileB > output
but that doesn't work. I always get the entire fileB.
The output of grep does get stored in the variable; I verified that with echo. So there is something I just don't get... It seems to me that the above line should work.

Any help would be greatly appreciated. Other solutions are welcome as well; I know a bit of Python.

Eric,
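
One likely reason the second attempt misbehaves: `$result` holds several lines and is expanded outside any quotes, so the shell splits it into words before awk ever runs. A minimal sketch, assuming a POSIX-style shell (sh or bash) and made-up sample names; `set --` stands in for the awk command line so the resulting arguments can be inspected:

```shell
# Hypothetical stand-in for the output of grep on fileA (two names, one per line).
result=$(printf 'prot1\nprot5\n')

# Same quoting as in the attempt: $result sits outside the single quotes,
# so the shell word-splits it on the newline.
set -- 'BEGIN { RS = "###" } /'$result'/'

arg_count=$#
first_arg=$1
second_arg=$2

printf 'arguments: %s\n' "$arg_count"
printf 'program awk would see: %s\n' "$first_arg"
printf 'stray operand: %s\n' "$second_arg"
```

The first argument ends in `/prot1`, a regex with no closing slash, and the leftover `prot5/` would be handed to awk as if it were a file name.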
 
Old 07-28-2010, 03:06 PM   #2
kalleanka
Member
 
Registered: Aug 2003
Location: Mallorca, Spain
Distribution: xubuntu
Posts: 547

Rep: Reputation: 38
maybe like this:

result=$(grep prot. fileA)

awk 'BEGIN { RS = "###" } /"$result"/' fileB > output
 
Old 07-28-2010, 04:16 PM   #3
chargaff
LQ Newbie
 
Registered: Jul 2010
Posts: 7

Original Poster
Rep: Reputation: 0
Thank you, kalleanka, for your reply.

Storing the result of grep works both ways, with $( ) or with backticks.

Interestingly, changing the quoting around the variable changes the behavior, but it still doesn't work. With single quotes I get an error (which I should have given you in my first post):

Quote:
awk: BEGIN { RS = "###" } /prot2
regular expression unterminated
/prot2 is what the first match should give.

With double quotes I get no output.

I feel like I'm very close, but something simple is escaping me.
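
The empty output with double quotes has a plausible explanation: inside the single-quoted awk program the shell never expands `"$result"`, so awk receives those characters literally as the regex, and nothing in fileB matches them. A sketch, assuming a small made-up fileB in a temporary directory and an awk that accepts a multi-character RS (gawk and mawk do; POSIX leaves it unspecified). Passing the name with `-v` is shown as one hedged alternative that gives awk a real variable:

```shell
workdir=$(mktemp -d)
printf '###\nprot1\nxxxcxcxcvvv\n###\nprot2\nxxxxccxcxcxc\n###\n' > "$workdir/fileB"

result=prot2   # single hypothetical name, standing in for one grep hit

# Literal case: awk receives the text "$result" as the regex, not prot2.
literal_out=$(awk 'BEGIN { RS = "###\n" } /"$result"/' "$workdir/fileB")

# -v case: awk gets a variable named pat and compares it with the first field.
var_out=$(awk -v pat="$result" 'BEGIN { RS = "###\n" } $1 == pat { printf "%s", $0 }' "$workdir/fileB")

printf 'literal regex matched: [%s]\n' "$literal_out"
printf 'with -v:\n%s\n' "$var_out"

rm -rf "$workdir"
```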
 
Old 08-11-2010, 06:54 AM   #4
chargaff
LQ Newbie
 
Registered: Jul 2010
Posts: 7

Original Poster
Rep: Reputation: 0
solved

OK, I finally found a simple solution:

Code:
for line in $(grep '.' fileA)
do
    awk -v var="$line" 'BEGIN { RS = "###" } $1 == var' fileB
done
The trick was to pass the shell variable to awk and then match it exactly in fileB.

Thanks!
 
Old 08-11-2010, 08:57 AM   #5
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,482

Rep: Reputation: 1890
So, assuming I am reading this right, awk can still do it:
Code:
awk 'f && $1 in a{printf $0}FNR == NR{for(;++i <= NF;)a[$i]++;f=1}' RS="###\n" fileA fileB
 
Old 08-11-2010, 09:04 AM   #6
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 241
Quote:
Originally Posted by chargaff
OK, I finally found a simple solution:

Code:
for line in $(grep '.' fileA)
do
    awk -v var="$line" 'BEGIN { RS = "###" } $1 == var' fileB
done
The trick was to pass the shell variable to awk and then match it exactly in fileB.

Thanks!
You are forking a separate awk process for every line the pattern finds. You can do this with just one awk command.
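
One possible shape for such a single command, different from grail's one-liner, is the common two-file idiom: read the names from fileA into an array, then switch the record separator and filter fileB. This is a sketch under assumptions: the sample files are made up in a temporary directory, the awk accepts a multi-character RS, and fileA is not empty (with an empty fileA the `NR == FNR` test would also be true for fileB).

```shell
workdir=$(mktemp -d)
printf 'prot1\nprot3\n' > "$workdir/fileA"
printf '###\nprot1\nxxxcxcxcvvv\n###\nprot2\nxxxxccxcxcxc\n###\nprot3\nxxdxdxgggbb\n###\n' > "$workdir/fileB"

# First file: default RS, one name per record; remember each name.
# Second file: RS is reassigned on the command line just before fileB is read,
# so each record is the text between two "###" lines.
matches=$(awk '
    NR == FNR { want[$1]; next }        # still reading fileA
    $1 in want { printf "%s", $0 }      # fileB record whose first field is wanted
' "$workdir/fileA" RS='###\n' "$workdir/fileB")

printf '%s\n' "$matches"

rm -rf "$workdir"
```

Only one awk process runs, however many names fileA holds.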
 
Old 08-12-2010, 06:45 AM   #7
chargaff
LQ Newbie
 
Registered: Jul 2010
Posts: 7

Original Poster
Rep: Reputation: 0
Grail, your line works perfectly and is much faster... Thanks!

I was aware that my solution was not efficient... at least it worked.

I'll look at your line soon, and if you could give me some hints on how it's constructed, I'd appreciate that very much! Thanks again.

Last edited by chargaff; 08-12-2010 at 06:48 AM.
 
Old 08-12-2010, 09:56 AM   #8
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,482

Rep: Reputation: 1890
Sure ... it's not that hard if you look at each piece:

1. RS="###\n" - Even though it comes at the end of the command, it is important to know that RS is set before the files are opened. Adding the newline character gives you a more accurate count of the number of records.

2. f && $1 in a{printf $0} - The order here is important, as we only want to perform these tasks for fileB and not fileA. By default, all variables in awk start with the value 0 (or false), so this is the value of 'f' until it is set to one later. Then we test whether the first field exists as an index in array 'a'. When both are true, print the record.

3. FNR == NR{for(;++i <= NF;)a[$i]++;f=1} - FNR is the count of records read from the current file and is reset for each file. NR is the count of all records read from all files, so the two are equal only while the first file (fileA) is being read. The for loop simply populates array 'a' with indexes equal to the field values of the first file, and f=1 then switches on the test in piece 2.
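
The three pieces can be laid out as a commented script. This is a sketch of the same logic, not grail's exact text: the sample files are made up, the terse `for` loop is written in its longer form, and `printf "%s", $0` replaces `printf $0` so that a `%` in the data would not be read as a format (an editorial choice, not part of the original). It assumes an awk that accepts a multi-character RS.

```shell
workdir=$(mktemp -d)
printf 'prot1\nprot3\n' > "$workdir/fileA"
printf '###\nprot1\nxxxcxcxcvvv\n###\nprot2\nxxxxccxcxcxc\n###\nprot3\nxxdxdxgggbb\n###\n' > "$workdir/fileB"

picked=$(awk '
    # Rule 1: only active once f is set, i.e. for records of fileB.
    f && ($1 in a) { printf "%s", $0 }

    # Rule 2: true only while the first file is read. fileA contains no
    # "###", so it arrives as a single record and every name is a field.
    FNR == NR {
        for (i = 1; i <= NF; i++)
            a[$i]++
        f = 1
    }
' RS='###\n' "$workdir/fileA" "$workdir/fileB")

printf '%s\n' "$picked"

rm -rf "$workdir"
```

Because rule 1 is tested before rule 2 sets `f`, the single fileA record is never printed.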
 
Old 08-13-2010, 06:10 AM   #9
chargaff
LQ Newbie
 
Registered: Jul 2010
Posts: 7

Original Poster
Rep: Reputation: 0
Wow, beautiful!

Thanks for your time, grail.

eric

Last edited by chargaff; 08-13-2010 at 06:13 AM.
 
  

