[SOLVED] grep many files in multiple directories using patterns from a file
Hi,
I have 6 folders in a directory, named "1", "2", "3", and so on. Each folder contains a different number of files named, for example, file1.info, file2.info, file3.info, and so on.
I would like to grep the values listed in another file, "values.txt", in each of the files in each of the directories, and write a file "matches.txt" listing the grep output together with the name of the file where each value was found. I have done this before using "grep -E -w -f" followed by the name of the file containing the patterns to match, but it is not working in this case. Can you help me figure out what I am doing wrong, please?
This is an example of one of the files, say file6.info in directory 4:
SNP Al1 Al2 Freq1
rs10430105 G A 0.71429
chr1:46138916 A G 0.98380
chr1:46138957 A A 1.00000
chr1:46138962 A G 0.99999
chr1:46139026 T T 1.00000
chr1:46139108 T T 1.00000
chr1:46139147 A G 0.99918
rs12041677 C A 0.71429
chr1:46139365 C T 0.99741
chr1:46139412 T C 0.99582
chr1:46139456 G A 0.99719
rs340874 C T 0.53911
When I do this using only one of the patterns specified in values.txt:
for numb in {1..6}; do
    for file in "$numb"/file*.info; do
        grep 'rs340874' "$file" >> matches.txt
    done
done
I get the correct match:
rs340874 C T 0.53911
But when I try using all the patterns in the file values.txt, it does not work and matches all of the records!
for numb in {1..6}; do
    for file in "$numb"/file*.info; do
        grep -E -w -f values.txt "$file" >> matches.txt
    done
done
Please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.
It appears to be working for me, given the above data. Your values.txt doesn't have DOS-style line endings, by any chance?
BTW: I don't see any reason why you need to use the -E option here, since your patterns are plain strings rather than regular expressions. Indeed, -F would probably be better.
Last edited by David the H.; 04-02-2012 at 06:20 AM.
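To illustrate the -E versus -F remark, here is a minimal sketch, assuming GNU grep. The ID "rs1.2" and the two data lines are made up for the demonstration and do not come from the poster's files. With -E the dot is a regular-expression wildcard, while with -F every pattern is taken literally.

```shell
# Throwaway directory with hypothetical data (not the poster's real files).
demo=$(mktemp -d)
printf 'rs1.2\n' > "$demo/patterns.txt"                         # made-up ID containing a dot
printf 'rs1.2 A G 0.5\nrs1x2 C T 0.5\n' > "$demo/data.info"

# -E treats "." as "any character", so both lines count as matches.
grep -c -E -w -f "$demo/patterns.txt" "$demo/data.info"         # prints 2

# -F treats the pattern as a fixed string, so only the literal ID matches.
grep -c -F -w -f "$demo/patterns.txt" "$demo/data.info"         # prints 1
```

The IDs shown in the thread contain no regex metacharacters, so both options should behave the same on them; -F simply removes the chance of a surprise.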
Hi David, thank you for your reply. I have tried dos2linux but it still does not work.
I have also tried using only the -F option, but still nothing. Do you know what else could be causing this problem?
Sorry. As I said, it works just fine for me, and I can't think of anything else that could keep it from working. You could try running the files through cat -A to see if there are any other non-printing characters messing things up.
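A minimal sketch of the cat -A check, assuming GNU coreutils (the -A option is not available in every cat implementation). The pattern file below is deliberately written with DOS (CRLF) line endings for illustration; the directory and the cleaned file name values.clean.txt are hypothetical.

```shell
# Made-up files in a throwaway directory, for illustration only.
tmp=$(mktemp -d)
printf 'rs340874\r\nrs12041677\r\n' > "$tmp/values.txt"    # CRLF line endings
printf 'rs340874 C T 0.53911\n' > "$tmp/file1.info"        # plain LF line ending

# GNU cat -A marks each line end with "$" and shows a carriage return as "^M",
# so a DOS-style line shows up as "rs340874^M$".
cat -A "$tmp/values.txt"

# Write a cleaned copy with the carriage returns removed.
tr -d '\r' < "$tmp/values.txt" > "$tmp/values.clean.txt"

# With the cleaned patterns, the fixed-string, whole-word search finds the record.
grep -w -F -f "$tmp/values.clean.txt" "$tmp/file1.info"
```

If cat -A shows only a "$" at the end of each line, line endings are not the problem and the cause lies elsewhere.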
The double-nested loop is rather clunky, though. You can use grep's -r and --include options to search out the files you want, or even just use globbing or brace expansion to pass it the files directly.
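The two loop-free approaches mentioned here could look roughly like this. It is a sketch, assuming GNU grep and the directory layout from the first post; the sample files are made up and only two of the six directories are created. The output name matches_recursive.txt is hypothetical. The -H option is added so that the file name is printed with each match, which the original poster asked for.

```shell
# Throwaway directory with made-up sample data (not the poster's real files).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir 1 2
printf 'SNP Al1 Al2 Freq1\nrs340874 C T 0.53911\nchr1:46139412 T C 0.99582\n' > 1/file1.info
printf 'SNP Al1 Al2 Freq1\nrs12041677 C A 0.71429\n' > 2/file2.info
printf 'rs340874\nrs12041677\n' > values.txt

# Option 1: let the shell glob hand all the files to a single grep call.
# [1-6] only expands to directories that actually exist.
grep -H -w -F -f values.txt [1-6]/file*.info > matches.txt

# Option 2: let grep walk the tree itself and keep only files named file*.info.
grep -r -w -F -f values.txt --include='file*.info' . > matches_recursive.txt

cat matches.txt
```

Each output line has the form "path:matching line", for example "1/file1.info:rs340874 C T 0.53911". Because --include restricts the recursive search to file*.info, values.txt and the output files are not searched themselves.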
Hi David, it works perfectly using what you proposed instead of the double loop! I really don't know what was wrong with my loops, but I guess it was wrong. Thank you very much.