Below is a log file from Apache (httpd).
I want a script that can work out the busiest date. The file is very long and consists of lines like the ones below:
I can do it with the grep command below, which works perfectly:
# less access.log |grep -c 03/Jan/2005
03
But with this I have to enter the pattern manually, i.e. 03/Jan/2005.
Is there a good way to write a script that will automatically give me the busiest day of the year? The file holds more than four years of access records, so this is only feasible with a good script.
You said you were looking for the busiest day of the year, but this will of course give one of the busiest days of all four years. So, split the file into separate years using grep.
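As a rough sketch of that per-year split, the helper below filters on the year before counting. The function name `busiest_in_year` is hypothetical, and it assumes the usual Apache timestamp layout of `[03/Jan/2005:10:00:00 ...]`; it is not the command posted earlier in the thread.

```shell
# Hypothetical helper: busiest_in_year YEAR LOGFILE
# Prints "count date" for one of the busiest days of the given year.
busiest_in_year() {
    # Keep only lines whose timestamp falls in the requested year
    # (assumed layout: "[dd/Mon/yyyy:"), reduce each line to its date,
    # then count identical dates and keep the highest count.
    grep "\[[0-9]*/[A-Za-z]*/$1:" "$2" |
        sed -ne 's|^[^\[]*\[\([^:]*\):.*$|\1|p' |
        sort | uniq -c | sort -n | tail -1
}
```

It could be called as `busiest_in_year 2005 access.log`, once per year of interest.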
Instead of picking the last entry using "tail -1", you could look for all days that were as busy as this one.
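One way to list every day that ties for the top count, instead of keeping only the last line, is to let awk remember the maximum. This is a sketch only: `busiest_days` is a hypothetical name, and the date extraction assumes timestamps of the form `[03/Jan/2005:...`.

```shell
# Hypothetical helper: busiest_days LOGFILE
# Prints "count date" for every day that shares the highest count.
busiest_days() {
    sed -ne 's|^[^\[]*\[\([^:]*\):.*$|\1|p' "$1" | sort | uniq -c |
        awk '{ count[$2] = $1 + 0; if (count[$2] > max) max = count[$2] }
             END { for (d in count) if (count[d] == max) print max, d }' |
        sort    # awk array order is unspecified, so sort for stable output
}
```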
The command doesn't care how many years are in access.log; it just picks one of the busiest dates in the entire file.
In view of your third question, remove "| tail -1" from the command and have a look at its output. I guess you'll then see how to extract the information you want.
Hi,
The above works, but I am not getting the right results. The problem is that where grep counts a date, say, 20 times, this script shows it only 18 times.
#less access.log |grep 20/April/2004 <enter>
230
So the actual result is 230 (the number of lines for 20 April).
When I do it with that script (mentioned above), see below:
The reason is of course that you were only grepping for the date (20/April/2004, which I guess should rather be 20/Apr/2004), whereas the sed command looks for lines containing "- -[date:" and the word "GET". You'll have to say precisely what you want. If you're only interested in lines containing the date, either of the following will probably do; the second is modelled after bigearsbilly's suggestion:
Code:
sed -ne "s|^.*\([0-9]\{2\}/[a-zA-Z]\{3\}/[0-9]\{4\}\).*$|\1|p"
sed -ne "s|^[^\[]*\[\([^:]*\):.*$|\1|p"
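Both commands only print one date per matching line, so the dates still have to be tallied. A minimal sketch of that step, using the first sed expression; the wrapper name `count_per_day` is hypothetical, and the sketch assumes each log line contains a single dd/Mon/yyyy date:

```shell
# Hypothetical helper: count_per_day LOGFILE
# Prints "count date" for every day, least busy first.
count_per_day() {
    # Reduce each line to its dd/Mon/yyyy date, group identical dates,
    # count each group, and order the result numerically by count.
    sed -ne 's|^.*\([0-9]\{2\}/[a-zA-Z]\{3\}/[0-9]\{4\}\).*$|\1|p' "$1" |
        sort | uniq -c | sort -n
}
```

`count_per_day access.log | tail -1` then shows one of the busiest days, and the count for any single day should agree with `grep -c 20/Apr/2004 access.log` as long as the date appears only once per line.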