LinuxQuestions.org


Alinuxnoob 01-28-2007 03:13 PM

How to create a bash script to count?
 
I have no bash scripting knowledge.
Hi, I was wondering how one goes about creating a bash script that searches a folder and counts how many times a given string, such as XXX, shows up in it. The folder contains .txt files filled with random letter groups, for example "XXX, XYE, XE, Z, NO".

Script scans and finds

XE shows 5x
NO shows 1x

and maybe show only anything that has been found more than once....

raskin 01-28-2007 04:10 PM

I recommend reading info bash, info grep, info wc, info sort, info uniq.
Something like 'grep -o letter file' (add -i to ignore case) can be piped ("|") to "wc -l" to count occurrences. The whole thing can go inside "$( )" to be assigned to a variable, or to serve as the first argument to "echo", with the letter as the second. You can do this inside a "for" loop over the list of all the words you are searching for and pipe the entire output (put "|" after "done", without a semicolon) to "sort -k 1n" and then possibly to "egrep -v '^1 '" (the single quotes are significant). You can iterate over all the files with a wildcard and another "for".
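The recipe above could be sketched roughly as follows. This is only an illustration under a few assumptions that are not from the post: the function name count_words is made up, the files are assumed to be *.txt in a single directory, and -w (whole-word matching) and -h (no file names) are added so that "XE" is not counted inside a longer token. Words found zero times are hidden along with those found once.

```shell
#!/bin/bash
# count_words DIR WORD...   (hypothetical helper name)
# For each WORD, count how often it occurs in DIR/*.txt, then list the
# counts in ascending order, hiding words seen fewer than twice.
count_words() {
    local dir=$1
    shift
    local word count
    for word in "$@"; do
        # -o prints every match on its own line, so wc -l counts matches
        # rather than matching lines.
        count=$(grep -o -w -h -- "$word" "$dir"/*.txt | wc -l)
        # %d also tidies up any padding some wc implementations add.
        printf '%d %s\n' "$count" "$word"
    done | sort -k 1n | grep -E -v '^(0|1) '
}
```

It might be called as: count_words /path/to/folder XXX XYE XE Z NO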

Alinuxnoob 01-28-2007 10:07 PM

I'm sorry, I was wondering if there is free code that already does this that I can download. I don't have the time to learn how to write a bash script. Where can I find a script that does this in Java or any other programming language?

sundialsvcs 01-28-2007 10:24 PM

"Patience, padawan ..." ;)

If you are truly anxious to receive results, then perhaps the first thing you need to recognize is that "an executable file" can be written in any language, not just the script language that is built-in to bash!

When the first line of a script file starts with "#!" (known in Unix circles as the "shebang"), followed by the path of a program, it names the program that should be executed in order to run the script. So a script can be implemented in any language, not just "bash script."

And, having said that, let me be very quick to add that I personally have very little use whatsoever for "bash scripts!" :eek:

Most of my scripts start with this line:
Code:

#!/usr/bin/python
If your "language du jour" is Java ... cool! Just change the first line!

"Your mileage may vary," but that is exactly the point! Unix/Linux environments don't limit you to "just one way of doing things." (If you come from a Windows world, that takes a little getting used to.)

matthewg42 01-28-2007 10:48 PM

grep can search many files at once. With the -c option it prints, for each file, the file name followed by the number of lines that matched the pattern you are looking for.
Code:

$ grep -c of *
test1:1
test2:4
test3:3
test4:3
test5:1

Note that this will not count multiple occurrences of the pattern on one line. Assuming this is OK, the question is now how to extract these numbers and add them up. You can ask grep not to print the file names using the -h option, or you could pass the output of "grep -c pattern *" through cut, chopping the lines on the delimiting character : and printing the second field. See the cut manual page for details.
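For what it's worth, the cut variant mentioned above might look something like this. The function name matching_line_counts is made up for the example, and -H is added as an assumption so that grep prints the "file:count" form even when only one file is given. It will misbehave for file names that themselves contain a ':'.

```shell
#!/bin/bash
# matching_line_counts PATTERN FILE...   (hypothetical helper name)
# Print one number per file: how many lines of that file match PATTERN.
matching_line_counts() {
    local pattern=$1
    shift
    # grep -c prints "file:count"; cut splits each line on ':' and keeps
    # only the second field, i.e. the count.
    grep -c -H -- "$pattern" "$@" | cut -d: -f2
}
```

Called as matching_line_counts of * it should print the same column of numbers as the grep -c output, minus the file names.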

Using -h is easier and invokes fewer processes. A good rule of thumb when constructing shell scripts is to invoke as few processes as possible, since this improves performance.
Code:

$ grep -c -h of *
1
4
3
3
1

Right, so how to add it up? Well, Linux has lots of useful utilities which you can generally count on to be installed. Here's one method:
Code:

... | awk '{ total += $1 } END { print total }'
...which will read a list of numbers, and print out the sum of their values.
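Put together, the two pieces might be wrapped up like this. It is a sketch only: the function name total_matching_lines is invented for the example, and the "+ 0" is an addition so that awk prints 0 instead of an empty line when it gets no input.

```shell
#!/bin/bash
# total_matching_lines PATTERN FILE...   (hypothetical helper name)
# Print the total number of lines, across all FILEs, that match PATTERN.
total_matching_lines() {
    local pattern=$1
    shift
    # grep -c -h prints one bare count per file; awk sums the column.
    grep -c -h -- "$pattern" "$@" | awk '{ total += $1 } END { print total + 0 }'
}
```

Bear in mind this still counts matching lines, so a line containing the pattern twice adds only 1 to the total.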
The thing about this is that awk is actually a pretty functional language interpreter, and if you're going to invoke it at all, you might as well use it to do the pattern matching as well and dispense with a shell script entirely. Doing this will also help you to check for multiple instances of your pattern.
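As a rough idea of what letting awk do the matching could look like, here is one possible sketch. The function name count_occurrences is made up, and using gsub() for the counting is just one way to do it, not necessarily what anyone in this thread had in mind. The pattern is treated as an awk regular expression, and backslashes in it are subject to awk's escape processing for -v assignments.

```shell
#!/bin/bash
# count_occurrences PATTERN FILE...   (hypothetical helper name)
# Print how many times PATTERN occurs in the FILEs, counting every
# non-overlapping match, including several on the same line.
count_occurrences() {
    local pattern=$1
    shift
    # gsub() returns the number of substitutions it made. Replacing each
    # match with "&" (the matched text itself) leaves the line unchanged.
    awk -v pat="$pattern" '{ total += gsub(pat, "&") } END { print total + 0 }' "$@"
}
```

Unlike grep -c, a line such as "of of" adds 2 to the total here.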

Once upon a time I used to write quite a lot of awk programs, some of which were non-trivial. Then one day I put one of the simpler ones through a2p, an automatic code translation program which reads an awk program and spits out a Perl program that (with luck) does the same thing.

The awk script I tested it on operated on pretty large data sets and took several minutes to run. The Perl version took about half as long. From that day on, I've pretty much used Perl instead of awk. Learn it - it's great.

OK, so here's how I'd do it in Perl:
Code:

#!/usr/bin/perl

use strict;
use warnings;

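# the first argument is the pattern; any remaining arguments are read as files by <>
my $pattern = shift || die "you must specify a pattern as the first argument\n";
my $count = 0;

while (my $line = <>) {
        # split ' ' breaks the line on runs of whitespace and skips leading whitespace
        foreach my $word (split ' ', $line) {
                # each word counts once, even if the pattern matches inside it several times
                # note, remove the i to make the check case sensitive
                if ( $word =~ /$pattern/i ) {
                        $count++;
                }
        }
}

print "total number of instances of $pattern is $count\n";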

I hope you will do your own implementation.


All times are GMT -5.