grep can search many files at once. With the -c option and more than one file, it prints each file name followed by the number of lines in that file which matched the pattern you are looking for.
Code:
$ grep -c of *
test1:1
test2:4
test3:3
test4:3
test5:1
Note that this counts matching lines, so several occurrences of the pattern on one line still count as one. Assuming this is OK, the question is now how to extract these numbers and add them up. You can ask grep not to print the file names using the -h option, or you could pass the output of "grep -c pattern *" through cut, splitting each line on the delimiting character : and printing the second field (cut -d: -f2). See the cut manual page for details.
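To make the cut approach concrete, here is a rough sketch. The scratch directory and the contents of the two files are made up purely for illustration, and it assumes mktemp is available and that none of the file names contain a colon (otherwise the second field would be part of the name rather than the count).

```shell
# Hypothetical sample files in a scratch directory, for illustration only.
dir=$(mktemp -d)
printf 'one of two\nnothing here\n' > "$dir/test1"
printf 'of of\nmost of it\nnone\n' > "$dir/test2"

# grep -c prints "name:count" for each file; cut splits each line on ':'
# and keeps only the second field, i.e. the count.
counts=$(cd "$dir" && grep -c of * | cut -d: -f2)
echo "$counts"

# Clean up the scratch directory.
rm -r "$dir"
```

With these sample files it should print 1 and then 2, one count per line.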
Using -h is easier, and invokes fewer processes. A good rule of thumb when constructing shell scripts is to invoke as few processes as possible, as this will improve performance.
Code:
$ grep -c -h of *
1
4
3
3
1
Right, so how to add it up? Well, Linux has lots of useful utilities which you can generally count on to be installed. Here's one method:
Code:
... | awk '{ total += $1 } END { print total }'
...which will read a list of numbers, and print out the sum of their values.
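Putting the two halves together might look something like the sketch below. The sample files are invented for illustration, and I have written "total + 0" rather than plain "total" so that awk prints 0 instead of an empty line when there is no input at all; that tweak is my own addition.

```shell
# Hypothetical sample files in a scratch directory, for illustration only.
dir=$(mktemp -d)
printf 'one of two\nnothing here\n' > "$dir/test1"
printf 'of of\nmost of it\nnone\n' > "$dir/test2"

# -c counts matching lines per file, -h drops the file names,
# and awk adds up the first field of every line it reads.
total=$(grep -c -h of "$dir"/* | awk '{ total += $1 } END { print total + 0 }')
echo "$total"

# Clean up the scratch directory.
rm -r "$dir"
```

Here test1 has one matching line and test2 has two, so the total printed is 3.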
The thing about this is that awk is actually a pretty capable language in its own right, and if you're going to invoke it at all, you might as well use it to do the pattern matching as well and dispense with grep entirely. Doing this will also let you count multiple instances of your pattern on the same line.
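As a rough sketch of that idea (one possible way, not the only one), awk's gsub() function returns the number of replacements it made, so replacing every match with itself gives a per-line count of occurrences. The sample files are again made up for illustration.

```shell
# Hypothetical sample files in a scratch directory, for illustration only.
dir=$(mktemp -d)
printf 'one of two\nnothing here\n' > "$dir/test1"
printf 'of of\nmost of it\nnone\n' > "$dir/test2"

# gsub() returns how many replacements it made on the current line.
# Replacing each match with itself ("&") leaves the text unchanged,
# so the return value is simply the number of occurrences.
total=$(awk '{ total += gsub(/of/, "&") } END { print total + 0 }' "$dir"/*)
echo "$total"

# Clean up the scratch directory.
rm -r "$dir"
```

This prints 4 for the sample files, because the line "of of" now counts twice, whereas the grep -c version counted it once.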
Once upon a time I used to write quite a lot of awk programs, some of which were non-trivial. Then, one day I put one of the simpler ones through a2p, which is an automatic code translation program that reads an awk program and spits out a Perl program that (with luck) does the same thing.
The awk script I tested it on operated on pretty large data sets, and took several minutes to run. The Perl version took about half as long. From that day on, I've pretty much used Perl instead of awk. Learn it - it's great.
OK, so here's how I'd do it in Perl:
Code:
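#!/usr/bin/perl
use strict;
use warnings;

# the pattern is the first argument; any remaining arguments are files to read
my $pattern = shift;
die "you must specify a pattern as the first argument\n"
    unless defined $pattern && length $pattern;
my $count = 0;
while (<>) {
    # split the line into whitespace-separated words
    my @words = split ' ', $_;
    foreach my $word (@words) {
        # each word containing the pattern counts once
        # note, remove the i to make the check case sensitive
        if ( $word =~ /$pattern/i ) {
            $count++;
        }
    }
}
print "total number of instances of $pattern is $count\n";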
I hope you will write your own implementation.