I need some advice on a Unix command.
I have a file, input.txt, that contains:
keyword1
keyword2
etc
I want to run grep for each keyword in input.txt and save the results in output.txt, e.g.:
grep -c "keyword1" file.txt
Can you please give me a command that does this?
1) A fascinating example demonstrating a useless use of both cat and echo in a single line.
2) You should be more careful with quoting your variables. You quote $searched_file and $output_file, but it doesn't make much difference because you don't quote $2 and $3.
3) Why do you count the lines? The count variable is not used anywhere.
Great!! Thanks for correcting my mistakes. I have only just started learning scripting (3-4 weeks in), so I still have a long way to go...
I want to run a grep -c command for each keyword in input.txt and save the results of grep in output.txt:
grep -c "aaa" file.txt
grep -c "bbb" file.txt
output.txt is
aaa 2 [i.e. the number of lines in file.txt that match the keyword 'aaa']
bbb 1
I can do this manually, but the files have a million keywords each.
Sorry about the confusion, guys.
Hope you can help.
input.txt contains a million keywords:
keyword1
keyword2
.
.
.
.
I want to run grep -c "keyword1" file.txt for every keyword in input.txt and save the number of occurrences of each keyword in the file,
so output.txt will be
keyword1 334
keyword 3342
keyword 6644
Well, that's what my first example does. I can imagine that searching for a million keywords and sorting the results takes a while, though. How large is the file you search in? How many occurrences in total do you expect to be there? Millions? More? In my example, I use grep -f, which searches for all the keywords at the same time. That's much faster than running grep a million times, once per keyword. On the other hand, my solution then sorts all the found occurrences so I can use uniq -c on the result. That might be quite slow if there are a lot of matches.
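For readers who don't have that first example in front of them, a single-pass grep -f pipeline of the kind described here might look roughly like the sketch below. It is not the original code. The function name count_keywords is made up for illustration, and the sketch assumes the keywords are fixed strings (hence -F) and that your grep supports -o (GNU and BSD grep do).

```shell
# Hypothetical helper (not from the thread): count keyword occurrences
# with one grep pass, then sort + uniq -c to tally them.
# Assumption: keywords are literal strings, one per line (-F).
count_keywords() {
    keywords=$1   # e.g. input.txt
    searched=$2   # e.g. file.txt
    grep -o -F -f "$keywords" -- "$searched" |
        sort |
        uniq -c |
        # uniq -c prints "  COUNT KEYWORD"; reorder to "KEYWORD COUNT"
        awk '{ c = $1; sub(/^[ \t]*[0-9]+[ \t]+/, ""); print $0, c }'
}
```

Usage would be something like count_keywords input.txt file.txt >output.txt. Two caveats: keywords that never match do not appear in the output at all, and when keywords overlap at the same position only one of them is reported by grep -o.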
The solution provided by mddesai is less efficient to grep, but it does not have to sort the output. Just modify it to
Code:
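while IFS= read -r line
do
    # print the keyword, then the number of lines in file.txt that match it
    printf '%s %s\n' "$line" "$(grep -c -e "$line" file.txt)"
done <input.txt >output.txt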
If you want a more efficient solution, I would try awk, but it would be more complicated.
OK, my idea of this in awk would be something like this:
Code:
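awk '
# first file (input.txt): register every non-empty keyword with a count of 0
NR == FNR { if ($0 != "") kw[$0] = 0; next }
# second file (file.txt): splitting a line on a keyword gives n pieces,
# i.e. n-1 occurrences; note that awk treats the keyword as a regex here
{
    for (w in kw) {
        n = split($0, a, w);
        if (n > 1) kw[w] += n - 1;   # n is 0 for empty lines, so guard it
    }
}
END {
    for (w in kw) { print w, kw[w]; }
}
' input.txt file.txt >output.txt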
Also, please note that there's a difference between "number of lines containing a pattern" and "number of occurrences of a pattern in a file". This does the latter. I haven't tested it, but I hope it would be at least slightly faster than the grep solution posted earlier.
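To make the lines-versus-occurrences distinction concrete, here is a small sketch on made-up sample data (the temporary file and its contents are only for illustration). grep -c counts matching lines, while grep -o piped into wc -l counts individual matches.

```shell
# Hypothetical sample data: 'aaa' appears 3 times, on 2 different lines.
sample=$(mktemp)
printf 'aaa aaa\nbbb\naaa\n' > "$sample"

lines=$(grep -c 'aaa' "$sample")           # number of lines containing the pattern
hits=$(grep -o 'aaa' "$sample" | wc -l)    # number of individual occurrences
hits=$((hits))                             # strip any padding some wc versions add

echo "lines=$lines occurrences=$hits"      # prints: lines=2 occurrences=3
rm -f "$sample"
```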