Linux - Newbie: This Linux forum is for members that are new to Linux.
Just starting out and have a question?
If it is not in the man pages or the how-tos, this is the place!
Really new to Unix, so I might not be able to explain my problem clearly, but here goes.
I have two variables, MONTH and YEAR, and need to look through a directory containing 15 or so files to find lines that contain both variables. Once this is done I need the output to show the file name and the number of times both variables appear in each file.
So far my grep command is:
grep $MONTH ~/webhits/* | grep -c -H $YEAR
When I run this command I get an output of "(standard input): [the number of times both my variables appear in the whole directory]", but I need it broken down into the number of times they appear in each file.
Thanks in advance for any help.
Last edited by UnixNewbie91; 04-27-2012 at 05:07 PM.
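The reason you see "(standard input)" is that the second grep only ever sees one combined stream piped from the first, so it cannot report per-file counts. Looping over the files keeps each file name attached to its own count. A minimal sketch (the demo directory, file names, and MONTH/YEAR values are illustrative, not from the original post):

```shell
#!/bin/sh
MONTH=May
YEAR=2007

# Illustrative sample data in a throwaway directory
mkdir -p /tmp/webhits_demo
printf '%s\n' '44.184.167.119 Mon May 07 08:11:50 GMT 2007' \
              '78.230.158.130 Thu May 10 01:59:33 GMT 2007' > /tmp/webhits_demo/site1.log
printf '%s\n' '10.0.0.1 Tue Jun 12 09:00:00 GMT 2007' > /tmp/webhits_demo/site2.log

# Loop so each grep -c counts within a single file, keeping the name attached
for f in /tmp/webhits_demo/*; do
    count=$(grep "$MONTH" "$f" | grep -c "$YEAR")
    printf '%s:%s\n' "$f" "$count"
done
```

With the sample data above this prints one line per file, e.g. site1.log with a count of 2 and site2.log with 0, which is the per-file breakdown the single pipeline cannot give you.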
Each file contains the number of web hits for a fake website from a number of different made-up IP addresses. Some of the IP addresses appear twice in the same file, so the hit from that IP address is counted multiple times. If I wanted to count the hit from each IP only once (to get the number of unique hits), how would I do this?
Here is an example of the file if it helps
44.184.167.119 Mon May 07 08:11:50 GMT 2007
78.230.158.130 Thu May 10 01:59:33 GMT 2007
78.230.158.130 Thu May 10 05:14:58 GMT 2007
So for these three hits I want to count the hits from IP address 78.230.158.130 as one unique hit.
I believe that's getting out of the realm of bash/grep. Something like awk would probably be powerful enough to do it. You could either leave your current grep intact and use awk to find the unique IPs from the matches, or use awk to do both steps. I'm not an awk expert, though, so I'll let somebody else chime in there.
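For what it's worth, counting each IP once doesn't strictly need awk: since the IP is the first whitespace-separated field in the log layout shown above, the grep filter can be followed by a field extraction, de-duplication, and count. A sketch (the demo file path and MONTH/YEAR values are assumptions for illustration):

```shell
#!/bin/sh
MONTH=May
YEAR=2007

# Illustrative sample file matching the layout from the post
cat > /tmp/hits_demo.log <<'EOF'
44.184.167.119 Mon May 07 08:11:50 GMT 2007
78.230.158.130 Thu May 10 01:59:33 GMT 2007
78.230.158.130 Thu May 10 05:14:58 GMT 2007
EOF

# Unique hits: filter on both variables, take the IP field, de-duplicate, count
grep "$MONTH" /tmp/hits_demo.log | grep "$YEAR" | cut -d' ' -f1 | sort -u | wc -l

# The same thing in one awk pass: print a line only the first time its IP is seen
awk -v m="$MONTH" -v y="$YEAR" '$0 ~ m && $0 ~ y && !seen[$1]++' /tmp/hits_demo.log | wc -l
```

Both pipelines report 2 for the three sample lines, because the duplicate 78.230.158.130 entry is counted once.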
Basically I have to write a script that looks at the files, which are all set up like the example I gave, and then puts into a table the name of the folder, the number of hits in a given time frame (sorted in descending order), and the number of unique hits (the number of different IP addresses).
If you used grep not to count but only to output the records that match your $YEAR, then piped that into the cut|sort combo above, you could count the matching lines.