Help with a UNIX script: Need to grab information based on time from a file
Hey everyone.
Hoping to get some help with a UNIX script I am trying to create.
Basically, I have a file that I need to grep through every 15-30 minutes, looking for certain errors. When these errors are found, an email is fired off to an email distribution list.
I have created a script that does everything I need. However, there are a few other caveats.
1.) I cannot touch the file at all. Meaning, I cannot recreate the file, create a temporary file from it, or "reset" the original and get back to it.
Our company policy will not let us modify the original file at all.
That is my biggest obstacle to completing this script.
So, I was trying to find out how I can make this script find only the new errors that happen within a specified time frame. For example, if I set up the script on a cronjob to run every 30 minutes and it fires off at 3pm, how can I make it find any errors that have occurred since 2:30pm?
I really just need to find the new errors, because otherwise it grabs every error that has ever occurred.
I think the easiest solution is not to have a cronjob at all. Read from the file in “follow mode” (i.e., with tail -f). Read from the pipe one line at a time and send off a mail for each error found. For example,
Code:
# --line-buffered (GNU grep) keeps grep from batching its output when writing to a pipe
tail -n +1 -f the_file | grep --line-buffered 'errorpattern' | while read -r line; do
	…
	# Code to format and send email with affected line as $line
	…
done
Sorry, I posted before reading your second post. You originally said you are unable to make temporary files, but above you say that you can do this. If you have a temporary file with the matching lines which you can save across the cronjobs, then you can just use comm (or diff) to get the new lines.
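To sketch what the comm approach might look like: the filenames here (prev_errors.txt, current_errors.txt) are placeholders, and you would need to create an empty prev_errors.txt once before the first run. One wrinkle: comm wants sorted input, and repeated identical error lines can show up as "new".

```shell
# Collect this run's matches (the_file and the pattern are placeholders)
grep 'errorpattern' the_file > current_errors.txt

# comm -13 prints lines unique to the second file,
# i.e. matches that were not present in the previous run
comm -13 <(sort prev_errors.txt) <(sort current_errors.txt)

# Save this run's matches for the next comparison
mv current_errors.txt prev_errors.txt
```

Note that this only touches the temp files, never the original.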
How about making the script get the size of the file you're checking? Every time the script runs, it should only analyse the last n bytes, where n is the number of bytes added since the last check.
This will only work if the file is ever-growing, though. Just my little idea.
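A minimal sketch of that byte-offset idea, assuming the log is the_file and the saved position lives in offset_file (both names are placeholders). As noted, this breaks if the file is ever truncated or rotated.

```shell
#!/bin/bash
# Read the byte offset saved by the previous run (0 on the first run)
offset=0
[ -e offset_file ] && offset=$(<offset_file)

size=$(wc -c < the_file)
if [ "$size" -gt "$offset" ]; then
    # tail -c +N starts output at byte N, so this reads only the newly appended bytes
    tail -c +"$((offset + 1))" the_file | grep 'errorpattern'
fi

# Remember how far we have read, for the next run
printf '%s\n' "$size" > offset_file
```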
Quote:
Originally Posted by osor
Sorry, I posted before reading your second post. You originally said you are unable to make temporary files, but above you say that you can do this. If you have a temporary file with the matching lines which you can save across the cronjobs, then you can just use comm (or diff) to get the new lines.
Yes. I have to leave the original file in place. I cannot modify it at all. I should have explained that a little better.
But I could copy that file to a temp file. That would allow me to use the copy to do my work against, and maybe do some other things as well.
Thoughts?
Yes, you could, but you have not commented on the other solutions posed. In particular, if you have tail continuously watch for changes to the file, you will get the offending lines as they occur (and you don’t need a cronjob; just leave the script running all the time, like a daemon).
The other solution is Maligree’s in which you could save the number of lines (or bytes, but lines is easier to deal with) to a counter file, which would be persistent through each run of the cronjob. The script would look something like this:
Code:
# Read the line count saved by the previous run (0 on the first run)
if [ -e countfile ]; then
	count=$(<countfile)
else
	count=0
fi

# Examine only the lines added since the last run
errors=$(tail -n +$((count + 1)) the_file | grep 'errorpattern')
if [ -n "$errors" ]; then
	printf 'The following errors were found\n\n%s\n' "$errors" | mailx -s 'Blah blah blah' email@email.com
fi

# Save the current line count for the next run
wc -l < the_file > countfile
But you have to have permission to write to countfile.
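If you go the cron route, the crontab entry itself is simple. A sketch (the script path is hypothetical):

```
# crontab -e entry: run the error check every 30 minutes
*/30 * * * * /path/to/check_errors.sh
```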
Last edited by osor; 05-20-2008 at 07:03 PM.
Reason: oops
I've received some more guidance from my manager and it remains the same: he does not want to alter the original file at all. "Leave it alone" were the exact words.
Perl *might* be a possibility. It requires our SAs to install some things, so it is being considered.
One thing I came across when working on this is that when I egrep through the file looking for my string, it works. However, what I noticed is that the information is on three lines IN the original file.
Line 1 - The timestamp (when the error occurred).
Line 2 - The error code.
Line 3 - The contents of the error.
So now I need to grab all three lines. Can I do that with grep? Or do I have to use something else?
I am still working around some things, and will try the countfile as suggested above.
I also have been thinking about what was suggested about using the diff command as well.
The last one I was thinking about was getting a line count on the actual file itself. However, I'm not sure how to proceed here.
Anyone have further suggestions? I feel like I have hit a brick wall lately.
I'd definitely go with Perl in that case; this is getting a bit fiddly to do reliably in bash.
Note that the Perl solution just reads the file, no alterations involved, so calm the manager down.
You could handroll it in Perl using stat(), which I've also done myself (before I came across that module), but it's fiddly and the module I mentioned handles it all for you cleanly.
Quote:
Originally Posted by dba_guy
So now I need to grab all three lines. Can I do that with grep? Or do I have to use something else?
Check to see what kind of grep you have available. The GNU version of grep has all sorts of bells and whistles (including context lines) which you can read about in the man page. Otherwise, awk and perl can do the same.
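For instance, assuming the error code sits on the middle line (timestamp above, contents below), GNU grep's context flags can pull in the neighbouring lines, and a small awk one-liner can do the same portably. Pattern and filename are placeholders, and the awk sketch assumes matches never start on the first or last line:

```shell
# GNU grep: 1 line of context before (-B1) and after (-A1) each match,
# so the timestamp and the error contents come along with the error code
grep -B1 -A1 'errorpattern' the_file

# Portable awk equivalent: print the previous line, the match, and the next line
awk '/errorpattern/ { print prev; print; getline nxt; print nxt } { prev = $0 }' the_file
```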
Quote:
Originally Posted by dba_guy
I am still working around some things, and will try the countfile as suggested above.
I also have been thinking about what was suggested about using the diff command as well.
I think in terms of ease of implementation the strategies are (from easiest to hardest):
1. Tail follow for continuous monitoring.
2. Cron job with a persistent file holding the line count from the previous job.
3. Cron job with a persistent copy of the previous job's error list.
4. Cron job with a persistent copy of the file from the previous job.
Notice that the first requires the least in terms of keeping any record on disk, and it is the most preferable (IMHO).
Quote:
Originally Posted by dba_guy
Last one I was thinking about was getting a line count on the actual file itself. However, not sure how to proceed here.
That’s a fairly standard grep option (see manual).
GNU grep does have flags to grab n preceding or following lines, if you want to go that route, and it can also give you a count of matching lines.
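For completeness, a couple of one-liners along those lines (pattern and filename are placeholders; neither command modifies the file):

```shell
# Count the lines that match the pattern
grep -c 'errorpattern' the_file

# Line count of the whole file, without the filename in the output
wc -l < the_file
```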
I'd have to stick in the Perl camp on this one, if this keeps going and turns into a combination of external file information gathering and internal data mangling, ad nauseam.
Your SAs should be okay with installing Perl. It comes prepackaged for most distros and, if there's a security concern, they can compile it from source with the -DPARANOID flag, or insist that they control all scripts, or wrap their custom Perl so that it's always running with taint checking (-T).
If I were in your situation, at this point, I'd do as little as possible until somebody made up their mind.
j.k. you do what you've gotta do. Best of luck to you