LinuxQuestions.org
Old 09-07-2009, 06:59 PM   #1
kptkill
LQ Newbie
 
Registered: Apr 2007
Posts: 20

Rep: Reputation: 0
help counting # of lines for every 10 minutes


I need a good kick to get me started creating a bash/perl script to count the number of lines seen in a given 10 minute window. (And if there is a name for doing that, it may help my searching)

I will be parsing through various Apache logs, so the time format is
[10/Oct/2009:13:55:36 -0500]

To further throw a wrench in the cogs, I also cannot assume that the log file time will always be in order. Logs may go Monday, Wednesday, Tuesday, Thursday... etc.

Any help, or directions to other posts would be greatly appreciated!
 
Old 09-07-2009, 10:09 PM   #2
nadroj
Senior Member
 
Registered: Jan 2005
Location: Canada
Distribution: ubuntu
Posts: 2,539

Rep: Reputation: 60
if you want to run something automatically, or to "schedule" something, you probably want to use "cron". see this link: http://en.wikipedia.org/wiki/Cron#Operators. it gives an example of how to run a script every 5 minutes, so you should be able to easily modify it for your needs. all that's left is to write the script!

for counting the number of lines, you probably know of the "wc" program. if you need to filter the messages before counting them, e.g. "only messages from the last 10 minutes", then you'll have to do some parsing/filtering (regex, grep, whatever).
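As a rough sketch of the "filter, then count" idea: a 10-minute window that starts on a multiple of ten minutes shares a common timestamp prefix, so a fixed-string grep can select it. The function name `count_window` is made up for illustration, and the sketch assumes each line carries the Apache-style `[10/Oct/2009:13:55:36 -0500]` timestamp. It will not work for windows that are not aligned to :00, :10, :20 and so on.

```shell
# count_window FILE PREFIX   (hypothetical helper, not from the thread)
# Counts the lines in FILE whose bracketed timestamp starts with PREFIX.
# Example: the prefix "10/Oct/2009:13:5" selects 13:50:00 through 13:59:59
# on 10 Oct 2009, because every minute in that window starts with "5".
count_window() {
    # -F: treat the pattern as a fixed string, so "[" and "/" are literal
    # -c: print the number of matching lines (same count as piping to wc -l)
    grep -c -F -- "[$2" "$1"
}
```

Called as `count_window access.log '10/Oct/2009:13:5'`, it prints a single number. Line order in the file does not matter, since every line is tested on its own.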

i'm not sure what you mean by your last statement. my guess is you may see the following lines back to back:
Code:
[10/Oct/2009:13:20:36 -0500]
[11/Oct/2009:13:25:36 -0500]
of course these 2 lines don't fall within the same 10 minute interval, so you have to write the logic for that. i would recommend a scripting language like perl to do the majority of your work, as it will be easier to handle this logic in perl than in a bash script (however, i am biased towards perl).

let us know once you have something started and have some code that isn't working, or if you still aren't sure where to start.
 
Old 09-08-2009, 03:21 AM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Debian
Posts: 8,578
Blog Entries: 31

Rep: Reputation: 1208
Quote:
Originally Posted by kptkill
I need a good kick to get me started creating a bash/perl script to count the number of lines seen in a given 10 minute window. (And if there is a name for doing that, it may help my searching)

I will be parsing through various Apache logs, so the time format is
[10/Oct/2009:13:55:36 -0500]

To further throw a wrench in the cogs, I also cannot assume that the log file time will always be in order. Logs may go Monday, Wednesday, Tuesday, Thursday... etc.

Any help, or directions to other posts would be greatly appreciated!
More information is necessary to define the requirement before a solution is possible. For starters (to illustrate the log file names and last modification time ordering), please post the output of the following command, issued from a command prompt with the Apache logs directory as the current working directory:
Code:
/bin/ls -lrt | tail -30
If the Apache logs are in the same directory as many other files, please add a filename pattern after "-lrt" so only the Apache logs are listed.

"10/Oct/2009:13:55:36" looks obvious but what does the " -0500" signify and does it always begin with a space?
 
Old 09-08-2009, 04:57 AM   #4
colucix
LQ Guru
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,509

Rep: Reputation: 1983
I don't fully understand the requirement. I read it as "count the lines in a log file that fall within a time window of 10 minutes". If that is the case, I wrote some awk code for this a while ago. After adapting it to the Apache log time format, I ended up with:
Code:
BEGIN { 
  inidate = "10 Oct 2009 14:15:00"
  enddate = "10 Oct 2009 14:25:00"
  ( "date -d \"" inidate "\" +%s" ) | getline initial_date
  ( "date -d \"" enddate "\" +%s" ) | getline final_date
  
  mm["Jan"] = 1
  mm["Feb"] = 2
  mm["Mar"] = 3
  mm["Apr"] = 4
  mm["May"] = 5
  mm["Jun"] = 6
  mm["Jul"] = 7
  mm["Aug"] = 8
  mm["Sep"] = 9
  mm["Oct"] = 10
  mm["Nov"] = 11
  mm["Dec"] = 12

}

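{
  # Extract the bracketed timestamp, e.g. 10/Oct/2009:13:55:36 -0500
  # ([^]]* stops at the first "]"; gensub and mktime need GNU awk)
  date = gensub(/^\[([^]]*)\].*/,"\\1","g")
  # Drop the timezone offset; times are compared as local time
  date = gensub(/^(.*) .*/,"\\1","g",date)
  # Replace "/" and ":" with spaces so split() yields day month year h m s
  date = gensub(/\/|:/," ","g",date)
  split(date,d)
  # mktime() expects "YYYY MM DD HH MM SS"
  date = sprintf("%s %s %s %s %s %s",d[3],mm[d[2]],d[1],d[4],d[5],d[6])
  current_date = mktime(date)
  if ( current_date >= initial_date && current_date <= final_date )
    print
}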
This prints the lines that fall within the 10-minute time window defined in the BEGIN section. To simply count them, just substitute the last part of the code:
Code:
    print
}
with
Code:
    count++
}
END { print count }
Forgive me if this is off topic.
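The script above relies on GNU awk (`gensub`, `mktime`) and on GNU `date -d`. Where those are not available, one possible variant is to turn each timestamp into a fixed-width, sortable string and compare strings instead of epoch seconds. This is only a sketch: the function name `count_between` and the `YYYY-MM-DD HH:MM:SS` bound format are assumptions made for the example, and the timezone offset is ignored, as in the original.

```shell
# count_between FILE START END   (hypothetical helper, not from the thread)
# START and END look like "2009-10-10 14:15:00"; both bounds are inclusive.
count_between() {
    awk -v start="$2" -v end="$3" '
    BEGIN {
        # Month abbreviation -> month number
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mm[names[i]] = i
    }
    {
        # Find the bracketed timestamp, e.g. [10/Oct/2009:13:55:36 -0500]
        if (!match($0, /\[[0-9]+\/[A-Za-z]+\/[0-9]+:[0-9]+:[0-9]+:[0-9]+/)) next
        # Skip the "[" and split day/month/year/hour/minute/second
        split(substr($0, RSTART + 1, RLENGTH - 1), d, "[/:]")
        # Fixed-width key, so string order equals chronological order
        key = sprintf("%04d-%02d-%02d %02d:%02d:%02d", d[3], mm[d[2]], d[1], d[4], d[5], d[6])
        if (key >= start && key <= end) count++
    }
    END { print count + 0 }
    ' "$1"
}
```

For the window used above this would be called as `count_between access.log "2009-10-10 14:15:00" "2009-10-10 14:25:00"`. Each line is tested independently, so out-of-order log entries are not a problem.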
 
Old 09-08-2009, 01:01 PM   #5
kptkill
LQ Newbie
 
Registered: Apr 2007
Posts: 20

Original Poster
Rep: Reputation: 0
Thank you all for your input. To clarify, what I am trying to do would take

[10/Oct/2009:13:15:36 -0500] foo
[10/Oct/2009:13:15:36 -0500] bar
[10/Oct/2009:13:25:36 -0500] foo
[10/Oct/2009:13:25:36 -0500] blah
[10/Oct/2009:13:25:36 -0500] blah
[10/Oct/2009:13:35:36 -0500] stuff

and output (which I can figure out later)
13:10-13:19 = 2 lines
13:20-13:29 = 3 lines
13:30-13:39 = 1 lines

I like some of the ideas I see so far and will definitely try to run with them.
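One way to get per-bucket counts like the ones requested here is to round each line's minute down to a multiple of ten and use the result as an array key. The sketch below is an illustration under assumptions, not a tested production script: the name `bucket_counts` is invented, the date is added to each bucket label so that different days stay separate, and the timezone offset is ignored. Because counts are accumulated in an array and sorted at the end, the input lines may arrive in any order.

```shell
# bucket_counts [FILE...]   (hypothetical helper, not from the thread)
# Prints one line per non-empty 10-minute bucket, in chronological order.
bucket_counts() {
    awk '
    BEGIN {
        # Month abbreviation -> month number
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
        for (i = 1; i <= n; i++) mm[names[i]] = i
    }
    {
        # Find the bracketed timestamp, e.g. [10/Oct/2009:13:55:36 -0500]
        if (!match($0, /\[[0-9]+\/[A-Za-z]+\/[0-9]+:[0-9]+:[0-9]+:[0-9]+/)) next
        split(substr($0, RSTART + 1, RLENGTH - 1), d, "[/:]")
        lo = int(d[5] / 10) * 10      # first minute of the bucket: 0, 10, ..., 50
        key = sprintf("%04d-%02d-%02d %02d:%02d-%02d:%02d", d[3], mm[d[2]], d[1], d[4], lo, d[4], lo + 9)
        count[key]++
    }
    END {
        for (key in count) printf "%s = %d lines\n", key, count[key]
    }
    ' "$@" | LC_ALL=C sort
}
```

Run on the six sample lines above, this should print `2009-10-10 13:10-13:19 = 2 lines`, `2009-10-10 13:20-13:29 = 3 lines` and `2009-10-10 13:30-13:39 = 1 lines`. Buckets with no lines are simply not listed.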
 
  

