LinuxQuestions.org
04-02-2012, 07:04 AM   #1
francy_casa
LQ Newbie
Registered: Sep 2011
Posts: 12
Rep: Reputation: Disabled
grep many files in multiple directories using patterns from a file


Hi,

I have 6 folders in a directory, named "1", "2", "3", and so on. Each folder contains a different number of files named, for example, file1.info, file2.info, file3.info, and so on.
I would like to grep for the values listed in another file, "values.txt", in every file in every folder, and write an output file "matches.txt" listing each matching line together with the name of the file where it was found. I have done this before using "grep -E -w -f" followed by the name of the file containing the patterns, but it is not working in this case. Can you help me figure out what I am doing wrong, please?

This is the values.txt file:

rs340874
rs6931514
rs10811661
rs8050136
rs1387153
rs7961581
rs1801214


This is an example of one of the files, say file6.info in directory 4:

SNP Al1 Al2 Freq1
rs10430105 G A 0.71429
chr1:46138916 A G 0.98380
chr1:46138957 A A 1.00000
chr1:46138962 A G 0.99999
chr1:46139026 T T 1.00000
chr1:46139108 T T 1.00000
chr1:46139147 A G 0.99918
rs12041677 C A 0.71429
chr1:46139365 C T 0.99741
chr1:46139412 T C 0.99582
chr1:46139456 G A 0.99719
rs340874 C T 0.53911

When I run the loop with only one of the patterns from values.txt, it works:

for numb in {1..6}; do
    for file in $numb/file*.info; do
        grep 'rs340874' "$file" >> matches.txt
    done
done

I get the correct match:
rs340874 C T 0.53911

But when I try using all the patterns in values.txt, it does not work: it matches all of the records!

for numb in {1..6}; do
    for file in $numb/file*.info; do
        grep -E -w -f values.txt "$file" >> matches.txt
    done
done

Thank you very much for the help!
 
04-02-2012, 07:17 AM   #2
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823
Rep: Reputation: 1958
Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.

It appears to be working for me, given the above data. Your values.txt doesn't have DOS-style line endings, by any chance?

BTW: I don't see any reason why you need the -E option here. Indeed, -F would probably be better, since the patterns are plain strings rather than regular expressions.
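
To illustrate the -E versus -F point, here is a minimal sketch. The file names and the dotted pattern are made up for the demonstration and are not from the thread. With the plain rs IDs in values.txt the two options should select the same lines; the difference only shows up once a pattern contains a regex metacharacter.

```shell
# Hypothetical demo data (sample.info and patterns.txt are invented names)
tmp=$(mktemp -d)
printf '%s\n' 'chr1:46138916 A G 0.98380' 'chr1x46138916 A G 0.50000' > "$tmp/sample.info"
printf '%s\n' 'chr1.46138916' > "$tmp/patterns.txt"

# -E: the "." is a regex wildcard, so it matches both ":" and "x"
regex_hits=$(grep -c -E -w -f "$tmp/patterns.txt" "$tmp/sample.info")

# -F: the pattern is a literal string, and no line contains "chr1.46138916"
# (grep exits non-zero when nothing matches, hence the "|| true")
fixed_hits=$(grep -c -F -w -f "$tmp/patterns.txt" "$tmp/sample.info" || true)

echo "regex matches: $regex_hits, fixed-string matches: $fixed_hits"
rm -rf "$tmp"
```

So -F is the safer default whenever the pattern file holds identifiers to be matched exactly as written.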

Last edited by David the H.; 04-02-2012 at 07:20 AM.
 
04-02-2012, 10:37 AM   #3
francy_casa
LQ Newbie
Registered: Sep 2011
Posts: 12
Original Poster
Rep: Reputation: Disabled
Hi David, thank you for your reply. I have tried dos2linux but it still does not work.
I have also tried using only the -F option, but nothing changed. Do you know what else could be causing this problem?

Thanks again.
 
04-02-2012, 03:19 PM   #4
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823
Rep: Reputation: 1958
Sorry. As I said, it works just fine for me, and I can't think of anything else that could keep it from working. You could try running the files through cat -A to see if there are any other non-printing characters messing things up.
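
A rough sketch of that check, using a simulated pattern file because the real values.txt is not available here. The CRLF endings and the trailing blank line are assumptions made for the demonstration, not something confirmed in the thread. A blank line is worth looking for because an empty pattern on its own matches every line; how it interacts with -w may depend on the grep version, so treat it as something to rule out rather than a diagnosis.

```shell
tmp=$(mktemp -d)
# Simulated values.txt with DOS (CRLF) line endings and a trailing blank line
printf 'rs340874\r\nrs6931514\r\n\r\n' > "$tmp/values.txt"

# GNU cat -A prints a carriage return as ^M and marks each line end with $
cat -A "$tmp/values.txt"
# rs340874^M$
# rs6931514^M$
# ^M$

# Write a cleaned copy: strip carriage returns, then drop empty lines
tr -d '\r' < "$tmp/values.txt" | grep -v '^$' > "$tmp/values.clean.txt"

cr=$(printf '\r')
cr_lines=$(grep -c "$cr" "$tmp/values.clean.txt" || true)   # lines still holding a CR
pattern_count=$(grep -c '' "$tmp/values.clean.txt")          # total lines left
echo "CR lines: $cr_lines, patterns: $pattern_count"
rm -rf "$tmp"
```

If the cleaned copy differs from the original, pointing grep -f at the cleaned file is a quick way to see whether stray characters were the problem.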


The doubly nested loop is rather clunky, though. You can use the -r and --include options to have grep find the files itself, or just use globbing or brace expansion to pass it the files directly.

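Code:
# Let grep recurse into directories 1-6 and select the files itself
grep -wF -f values.txt --include='file*.info' -r ./{1..6}

# Or let the shell expand the glob into the list of files
grep -wF -f values.txt ./[1-6]/file*.info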
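
For anyone who wants to try the glob form without the original data, the following is a small self-contained sketch. The temporary directory tree is invented and contains only a few lines modelled on the sample in the first post. The -H flag is an addition not used in the thread: grep prefixes the file name by default when given several files, and -H forces that prefix even if the glob happens to expand to a single file.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/1" "$tmp/4"
# Two tiny stand-in data files (contents modelled on the sample in post #1)
printf '%s\n' 'SNP Al1 Al2 Freq1' 'rs10430105 G A 0.71429' > "$tmp/1/file1.info"
printf '%s\n' 'SNP Al1 Al2 Freq1' 'rs12041677 C A 0.71429' 'rs340874 C T 0.53911' > "$tmp/4/file6.info"
printf '%s\n' 'rs340874' 'rs6931514' > "$tmp/values.txt"

# Run from the top-level directory so the glob sees ./1 and ./4
matches=$(cd "$tmp" && grep -H -wF -f values.txt ./[1-6]/file*.info)
echo "$matches"
# ./4/file6.info:rs340874 C T 0.53911
rm -rf "$tmp"
```

Each output line carries the file name before the colon, which covers the original request to record where each value was found.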
 
04-12-2012, 09:49 AM   #5
francy_casa
LQ Newbie
Registered: Sep 2011
Posts: 12
Original Poster
Rep: Reputation: Disabled
Hi David, it works perfectly using what you proposed instead of the double loop! I still don't know what was wrong with my loops, but something clearly was. Thank you very much.
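
The thread never establishes why the loops misbehaved. One thing that could contribute, offered purely as an assumption, is that ">>" appends, so lines from earlier test runs stay in matches.txt unless the file is emptied first. If a loop is still wanted, a more defensive version might look like the sketch below. The function name run_search and the temporary test data are hypothetical.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/4"
printf '%s\n' 'SNP Al1 Al2 Freq1' 'rs340874 C T 0.53911' > "$tmp/4/file6.info"
printf '%s\n' 'rs340874' > "$tmp/values.txt"

run_search() {
    : > matches.txt                       # empty the file so old results cannot linger
    for numb in 1 2 3 4 5 6; do
        for file in "$numb"/file*.info; do
            [ -e "$file" ] || continue    # skip directories with no matching files
            # -H prints the file name even though grep sees one file at a time
            grep -H -wF -f values.txt "$file" >> matches.txt
        done
    done
}

# Run twice from the top-level directory; the second run does not duplicate output
( cd "$tmp" && run_search && run_search )
result=$(cat "$tmp/matches.txt")
echo "$result"
# 4/file6.info:rs340874 C T 0.53911
rm -rf "$tmp"
```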
 
  


All times are GMT -5.
