LinuxQuestions.org
Old 06-26-2008, 06:52 AM   #1
elinenbe
LQ Newbie
 
Registered: Oct 2007
Posts: 23

Rep: Reputation: 15
create an error table? finding strings, and counting... in bash


I wrote a script that searches an error log file for known errors, counts them, and then displays statistics at the end. However, it runs slow as molasses. I use grep and two loops to go through everything.

Here is an example of the file:

Code:
04/02/08:20:16:57 - y:\logs: 04/02/08 20:16:57.300 - No valid sum
04/03/08:05:04:38 - y:\logs: 04/03/08 05:04:38.759 - ID does not match
04/03/08:05:15:16 - y:\logs: 04/03/08 05:15:16.695 - Wrong Batch
04/03/08:05:26:41 - y:\logs: 04/03/08 05:26:41.461 - Unknown Exception
04/03/08:05:30:41 - y:\logs: 04/03/08 05:30:41.289 - I Am A Bad Error
04/03/08:06:00:58 - y:\logs: 04/03/08 06:00:58.633 - Wrong Batch
04/03/08:06:00:58 - y:\logs: 04/03/08 06:00:58.633 - Wrong Error
04/03/08:06:00:58 - y:\logs: 04/03/08 06:00:58.633 - Unknown Exception
04/03/08:06:00:58 - y:\logs: 04/03/08 06:00:58.633 - I Am A Bad Error
Now what I have is a list of acceptable errors:
Code:
okerror=(
       "No valid sum"
       "ID does not match"
       "Wrong Batch"
       "Unknown Exception"
       )
When the script is run, I'd like an output file something like this:
Code:
OK Errors:
1 No valid sum
1 ID does not match
2 Wrong Batch
2 Unknown Exception
BAD Errors:
2 I Am A Bad Error
1 Wrong Error
The bad errors can be anything that is not in the okerror array. I just think that someone here could do something better than what I have, as it takes almost a second per line. I was thinking of something along the lines of "grep -f", but I just can't come up with anything very elegant.

Thanks,
Eric
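
For reference, the "grep -f" idea could look roughly like the sketch below. It assumes the acceptable messages are stored one per line in a separate pattern file (a layout not given in the post), and split_by_known is a hypothetical helper name. It reports only totals for each class, not per-message counts.

```shell
#!/bin/sh
# Hypothetical helper: $1 = log file, $2 = pattern file holding one
# acceptable message per line (assumed layout, not from the original post).
# Note: an empty line in the pattern file would match every log line.
split_by_known() {
    # -F treats patterns as fixed strings, -f reads them from a file,
    # -c prints the number of matching lines, -v inverts the match.
    echo "OK lines: $(grep -c -F -f "$2" "$1")"
    echo "BAD lines: $(grep -c -v -F -f "$2" "$1")"
}
```

It would be called as: split_by_known logfile ok_patterns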
 
Old 06-26-2008, 08:59 AM   #2
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
I think a Perl/Python hash-table-based solution would probably be faster, but this might be fast enough. It got through 35,000 lines in 0.7 seconds (just your sample file duplicated). The only thing that annoys me is the need for a temp file; if only tee could send a copy to another process...

I assumed that "-" is the delimiter; if it shows up in the error messages or in the times/locations, this won't work.

Code:
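#!/bin/sh

# Acceptable error messages, joined into one extended regex alternation.
okerror="No valid sum|ID does not match|Wrong Batch|Unknown Exception"

# Keep the third "-"-separated field (the message), then count each distinct one.
cut -d- -f3 logfile | sort | uniq -c > counts

echo "OK Errors:"
grep -E "$okerror" counts

echo "BAD Errors:"
grep -E -v "$okerror" counts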
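
The hash table idea mentioned above can also be sketched in awk, which avoids the temp file by counting and classifying in one pass. This is only a sketch under assumptions: count_errors is a hypothetical helper name, fields are assumed to be separated by " - " (with the spaces), and the order of lines within each group is unspecified because awk does not define the iteration order of its arrays.

```shell
#!/bin/sh
# Hypothetical helper: $1 = log file. Assumes " - " separates the fields
# and that the message is the third field, as in the sample log.
count_errors() {
    awk -F' - ' '
        BEGIN {
            # Acceptable messages, stored as keys of an associative array.
            ok["No valid sum"] = 1
            ok["ID does not match"] = 1
            ok["Wrong Batch"] = 1
            ok["Unknown Exception"] = 1
        }
        # Count every message seen; lines with fewer than 3 fields are skipped.
        NF >= 3 { count[$3]++ }
        END {
            print "OK Errors:"
            for (msg in count) if (msg in ok) print count[msg], msg
            print "BAD Errors:"
            for (msg in count) if (!(msg in ok)) print count[msg], msg
        }
    ' "$1"
}
```

It would be called as: count_errors logfile > report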
 
Old 06-26-2008, 09:53 AM   #3
elinenbe
LQ Newbie
 
Registered: Oct 2007
Posts: 23

Original Poster
Rep: Reputation: 15
WOW! This is super-slick.

Can you explain what this line does a little?

Code:
cut -d- -f3 logfile | sort | uniq -c > counts
I think I have it...

cut splits each line of the logfile on dashes and keeps the third field, then the output is sorted, and uniq -c counts the occurrences of each distinct line. Very nice. It helps to know about these GNU utilities. So much for the crap I wrote.

Thanks!
 
Old 06-26-2008, 03:49 PM   #4
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,780

Rep: Reputation: 2081
Quote:
I think I have it...
Yup, that's right.

You can always run each part of the pipeline separately to see what it does:
Code:
~/tmp$ cut -d- -f3 logfile
 No valid sum
 ID does not match
 Wrong Batch
 Unknown Exception
 I Am A Bad Error
 Wrong Batch
 Wrong Error
 Unknown Exception
 I Am A Bad Error
~/tmp$ cut -d- -f3 logfile | sort
 I Am A Bad Error
 I Am A Bad Error
 ID does not match
 No valid sum
 Unknown Exception
 Unknown Exception
 Wrong Batch
 Wrong Batch
 Wrong Error
~/tmp$ cut -d- -f3 logfile | sort | uniq -c
      2  I Am A Bad Error
      1  ID does not match
      1  No valid sum
      2  Unknown Exception
      2  Wrong Batch
      1  Wrong Error
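
If a "-" could appear in the path or timestamp fields, the cut step would pick the wrong field. One possible workaround, sketched here under the assumption that the message always follows the last " - " on the line, is to strip everything up to that point with sed. extract_messages is a hypothetical helper name, and the sketch still breaks if a message itself contains " - ".

```shell
#!/bin/sh
# Hypothetical helper: $1 = log file. Prints only the message part of each line.
extract_messages() {
    # The greedy ".* - " consumes everything through the LAST " - " on the line,
    # so the message is printed without the leading space that cut leaves behind.
    sed 's/.* - //' "$1"
}
```

It drops into the same pipeline: extract_messages logfile | sort | uniq -c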
 
  


All times are GMT -5. The time now is 08:11 PM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
Open Source Consulting | Domain Registration