LinuxQuestions.org

reginah72 02-26-2016 09:58 AM

Need help with log files and browser statistics
 
I am an extreme newbie to Linux. I'm taking a class on it right now, but I am hopelessly lost in this project we are doing with log files. Our teacher never said anything in class about log files, so I have basically been looking them up online.

I have to find the number of visitors using MSIE and Firefox, broken down by version. Then I have to get the number of people using any other browser. Right now I can get to the field the browser is listed in by doing cut -f6 example.log. I have tried several other things using grep, tr, cut, uniq, sort, and wc, but I still can't get just the count of people by browser version. Can anyone please help me?

Here are a few lines of the log file we are using:

192.168.28.168 user143 [08/May/2010:09:52:52] "GET /NoAuth/js/scriptaculous/scriptaculous.js?load=effects,controls HTTP/1.1" "http://www.example.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

192.168.28.168 user147 [08/May/2010:09:52:52] "GET /NoAuth/js/prototype/prototype.js HTTP/1.1" "http://www.example.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

192.168.28.168 user174 [08/May/2010:09:52:52] "GET /NoAuth/js/ahah.js HTTP/1.1" "http://www.example.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

192.168.28.168 user82 [08/May/2010:09:52:52] "GET /NoAuth/js/titlebox-state.js HTTP/1.1" "http://www.example.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

192.168.28.168 user14 [08/May/2010:09:52:52] "GET /NoAuth/css/validation.css HTTP/1.1" "http://www.example.com/index.html" "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"

hydrurga 02-26-2016 10:08 AM

The idea with anything like this is to break it down into steps, starting simple and then making it increasingly complex.

First of all, ignoring MSIE, how would you determine how many people had used Firefox?

reginah72 02-26-2016 10:23 AM

Well, I know I would have to get to the field with Firefox in it first, and I'm pretty sure I would use something like cut -d. I know about the single quotes ' ', but I have no clue what I would put in there. All the symbols confuse me, even when I am sitting there with the book open to it. So I figure there would probably be two cut commands in there: one to get to the field (cut -f6) and the other cut -d. I started with grep example.log | cut -f6 | cut -d, but I have no clue what else to put there besides sort and uniq -c.

hydrurga 02-26-2016 10:32 AM

Quote:

Originally Posted by reginah72 (Post 5506582)
Well, I know I would have to get to the field with Firefox in it first, and I'm pretty sure I would use something like cut -d. I know about the single quotes ' ', but I have no clue what I would put in there. All the symbols confuse me, even when I am sitting there with the book open to it. So I figure there would probably be two cut commands in there: one to get to the field (cut -f6) and the other cut -d. I started with grep example.log | cut -f6 | cut -d, but I have no clue what else to put there besides sort and uniq -c.

To find out how many people used Firefox, it is a simple grep, or a simple grep piped through a wc. There are no cuts involved.

Can you figure out either of these?

As I said, we will start simple and *then* make things more complex. Always try to break any complex task into simpler steps.

reginah72 02-26-2016 10:38 AM

Okay, so I used grep Firefox example.log | wc -l and it came back with the number 7871.

hydrurga 02-26-2016 10:48 AM

Quote:

Originally Posted by reginah72 (Post 5506590)
Okay, so I used grep Firefox example.log | wc -l and it came back with the number 7871.

Splendid. The other way of doing it is by using grep -c. It's always useful to have a quick read through the options for the main commands like grep to see if there are any that might prove useful.
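As a minimal sketch of the two approaches side by side, the snippet below runs both against a tiny made-up log. The file name sample.log and its three lines are assumptions for illustration only, not the class's example.log.

```shell
# Build a three-line stand-in log in a temporary directory (made-up data).
workdir=$(mktemp -d)
log="$workdir/sample.log"
cat > "$log" <<'EOF'
10.0.0.1 user1 "GET /a HTTP/1.1" "Mozilla/5.0 (Windows; U) Gecko/20100401 Firefox/3.6.3"
10.0.0.2 user2 "GET /b HTTP/1.1" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
10.0.0.3 user3 "GET /c HTTP/1.1" "Mozilla/5.0 (X11; U; Linux) Gecko/20100401 Firefox/3.5.9"
EOF

# Approach 1: grep prints each matching line, wc -l counts those lines.
count_wc=$(grep Firefox "$log" | wc -l)

# Approach 2: grep -c counts the matching lines itself.
count_c=$(grep -c Firefox "$log")

echo "grep | wc -l: $count_wc"
echo "grep -c:      $count_c"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Both forms count matching lines, not individual occurrences of the word, so a line that mentions Firefox twice still counts once.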

Another thing to bear in mind is that there are often many ways to achieve the same result in Linux, due to the strength and versatility of the commands.

Ok. Next. Do a simple grep for Firefox, but output only the word Firefox for each match (i.e. the search term), not the whole line (hint: look at the grep options).

reginah72 02-26-2016 11:23 AM

Okay, finally got that one: it's grep -o Firefox example.log

hydrurga 02-26-2016 11:36 AM

Quote:

Originally Posted by reginah72 (Post 5506612)
Okay, finally got that one: it's grep -o Firefox example.log

Great. Ok, now we make it slightly more complex. Instead of just outputting "Firefox" for each grep hit, you want it to output the version too e.g. Firefox/3.7.3.

There may well be an easier way to do it, but using regular expressions in grep is a good way of going about this. In English, instead of just looking for "Firefox", you want to look for Firefox, followed by a forward slash /, followed by a group of characters that can be a number or a dot.

grep treats its search term as a regular expression by default. However, it's a good idea to put double quotes around the search term once it gets more complicated than a simple alphanumeric expression, so that the shell passes it to grep unchanged (it's not necessary in *this particular* case, but it is good practice). So,

Code:

grep "Firefox/" example.log

finds Firefox followed by a forward slash. Try it.
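For anyone following along without the class log, here is a small sketch of what the quoted search term does, with and without the -o option from the previous step. The file sample.log and its contents are made up for illustration.

```shell
# Build a three-line stand-in log in a temporary directory (made-up data).
workdir=$(mktemp -d)
log="$workdir/sample.log"
cat > "$log" <<'EOF'
10.0.0.1 user1 "GET /a HTTP/1.1" "Mozilla/5.0 (Windows; U) Gecko/20100401 Firefox/3.6.3"
10.0.0.2 user2 "GET /b HTTP/1.1" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
10.0.0.3 user3 "GET /c HTTP/1.1" "Mozilla/5.0 (X11; U; Linux) Gecko/20100401 Firefox/3.5.9"
EOF

# Without -o: the whole of every line containing "Firefox/" is printed,
# so the MSIE line is filtered out and two full lines remain.
matching_lines=$(grep "Firefox/" "$log")

# With -o: only the matched text itself is printed, once per match.
matched_text=$(grep -o "Firefox/" "$log")

printf '%s\n' "$matching_lines"
printf '%s\n' "$matched_text"

# Clean up the temporary directory.
rm -rf "$workdir"
```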

All you have to do now is find out how to use a regular expression search for the number/dot group of characters as well.

hydrurga 02-26-2016 11:49 AM

A simple but useful tutorial:

https://www.digitalocean.com/communi...terns-in-linux

hydrurga 02-26-2016 11:57 AM

A simplified cheat sheet of the regexp operators:

http://ryanstutorials.net/linuxtutor...tsheetgrep.php

reginah72 02-26-2016 12:29 PM

Got it.
Code:
grep "Firefox/.*" example.log
or I can still do grep -o "Firefox/.*" example.log

Now I have to sort it, right?

Think I have it: grep "Firefox/.*" example.log | sort | uniq -c

hydrurga 02-26-2016 12:52 PM

Quote:

Originally Posted by reginah72 (Post 5506651)
Got it.
Code:
grep "Firefox/.*" example.log
or I can still do grep -o "Firefox/.*" example.log

Now I have to sort it, right?

Almost there. Have you tried it to see what output it gives?

The regular expression you have used looks for Firefox, followed by a forward slash, followed by 0 or more characters. So, it will output everything until the end of the line. Not what we wanted.
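To make that concrete, here is a small sketch that feeds one user-agent string (copied from the sample log lines in the first post) through the same pattern. The variable names are only for illustration.

```shell
# One user-agent string taken from the sample log lines in the question.
line='"Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.2.3) Gecko/20100401 Firefox/3.6.3 GTB7.0"'

# ".*" means "any character, zero or more times", so the match keeps going
# to the end of the line, trailing text and closing quote included.
greedy=$(printf '%s\n' "$line" | grep -o "Firefox/.*")

echo "$greedy"
```

The output is Firefox/3.6.3 GTB7.0" rather than just Firefox/3.6.3, which is why lines with different trailing text would end up being counted as different "versions".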

If you want to search for a collection of different characters, but don't care what order they're in, you use square brackets [].

For example, to search for any alphabetical character, lower or upper case, you use [a-zA-Z] (i.e. 2 ranges of characters).

If you want to search for any numerical digit, you use [0-9]

If you want to search for one of a combination of lowercase characters and digits, you use [a-z0-9]

You can also have other characters in there e.g. to search for a % or & symbol or the lowercase f, you can use [%&f] (it doesn't matter what order you put these in the square brackets). You can also mix ranges and individual characters in the same square brackets.

But each of these matches only *one* character from within the square brackets. To match 0 or more of them in a row, you put a * afterwards, so []*

Finally, a dot is one of the special characters in regular expressions, so if you want to look for an actual dot outside square brackets then you must "escape" it by putting a backslash in front of it e.g. \. (inside square brackets a dot is already literal and needs no backslash).
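A few throwaway experiments can help these rules sink in. The sketch below uses made-up input strings (nothing from the log file) and grep -o so that each match is printed on its own line.

```shell
# [0-9] matches exactly ONE digit, so each digit is a separate match.
one_digit=$(printf '%s\n' 'room 42' | grep -o "[0-9]")

# Order inside the brackets does not matter: [%&f] and [f&%] are the same set.
set_a=$(printf '%s\n' 'f%&z' | grep -o "[%&f]")
set_b=$(printf '%s\n' 'f%&z' | grep -o "[f&%]")

# A * after the closing bracket means "zero or more of these characters".
# Here: the letter v followed by any run of digits.
run=$(printf '%s\n' 'v12 v3x' | grep -o "v[0-9]*")

# An unescaped dot matches any character; \. matches only a literal dot.
any_char=$(printf '%s\n' 'a.b axb' | grep -o "a.b")
literal_dot=$(printf '%s\n' 'a.b axb' | grep -o "a\.b")

printf 'one digit at a time:\n%s\n' "$one_digit"
printf 'set of characters:\n%s\n' "$set_a"
printf 'v plus a run of digits:\n%s\n' "$run"
printf 'unescaped dot:\n%s\n' "$any_char"
printf 'escaped dot:\n%s\n' "$literal_dot"
```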

You have what you need. Can you come up with the grep command to find Firefox, followed by a forward slash /, followed by a group of characters that can be a number or a dot?

hydrurga 02-26-2016 12:54 PM

P.S. Excellent job with the

Code:

sort | uniq -c
!!!
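For readers wondering why the sort matters: uniq only merges identical lines that sit next to each other. A small sketch with three made-up version strings shows the difference.

```shell
# Three made-up lines; the two "3.6.3" entries are NOT adjacent.
versions='Firefox/3.6.3
Firefox/3.5.9
Firefox/3.6.3'

# Without sort, uniq sees no adjacent duplicates and reports three groups of 1.
unsorted=$(printf '%s\n' "$versions" | uniq -c)

# With sort first, identical lines become adjacent and are counted together.
sorted=$(printf '%s\n' "$versions" | sort | uniq -c)

printf 'uniq -c alone:\n%s\n' "$unsorted"
printf 'sort | uniq -c:\n%s\n' "$sorted"
```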

reginah72 02-26-2016 01:20 PM

You're right, some of them have the (net clr and numbers after it) but others have Firefox and the version number.

So I did this one instead: grep "Firefox/.[0-9.]*" example.log | sort | uniq -c

So MSIE would be about the same, right? Answered my own question: I guess this one is harder.

Was that a , after the /? Firefox/,

hydrurga 02-26-2016 01:25 PM

Quote:

Originally Posted by reginah72 (Post 5506690)
You're right, some of them have the (net clr and numbers after it) but others have Firefox and the version number.

So I did this one instead: grep "Firefox/.[0-9]*" example.log | sort | uniq -c

So MSIE would be about the same, right?

Did you test the output? Can you please tail the last 5 lines of your

Code:

grep "Firefox/.[0-9]*" example.log
and cut and paste it here?

To properly develop Linux commands and scripts, you need to be sitting in front of a keyboard and screen, testing what you're developing as you go along to see that it works correctly.

Edit: I see that you've now changed the grep command in your previous post. By all means cut and paste the output from that revised command.
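As a sketch of the kind of check being asked for here: the snippet below looks at the tail of the matches first and only then counts them. It assumes the -o option is kept and the pattern is tightened to Firefox/[0-9.]*, which is one reading of the hints above and not a command confirmed in this thread. The file sample.log and its four lines are made up.

```shell
# Build a four-line stand-in log in a temporary directory (made-up data).
workdir=$(mktemp -d)
log="$workdir/sample.log"
cat > "$log" <<'EOF'
10.0.0.1 user1 "GET /a HTTP/1.1" "Mozilla/5.0 (Windows; U) Gecko/20100401 Firefox/3.6.3 GTB7.0"
10.0.0.2 user2 "GET /b HTTP/1.1" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"
10.0.0.3 user3 "GET /c HTTP/1.1" "Mozilla/5.0 (X11; U; Linux) Gecko/20100401 Firefox/3.5.9"
10.0.0.4 user4 "GET /d HTTP/1.1" "Mozilla/5.0 (Windows; U) Gecko/20100401 Firefox/3.6.3 (.NET CLR 3.5.30729)"
EOF

# Step 1: look at the last few raw matches before counting anything.
last_matches=$(grep -o "Firefox/[0-9.]*" "$log" | tail -n 5)

# Step 2: once the matches look right, sort them and count each version.
version_counts=$(grep -o "Firefox/[0-9.]*" "$log" | sort | uniq -c)

printf 'last matches:\n%s\n' "$last_matches"
printf 'counts per version:\n%s\n' "$version_counts"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Note that this counts log lines (requests) per version, not distinct people; whether that is what the assignment means by "visitors" is a separate question.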


All times are GMT -5.