Help with a UNIX script: Need to grab information based on time from a file
Hey everyone.
Hoping to get some help with a UNIX script I am trying to create.
Basically, I have a file that I need to grep through every 15-30 minutes, looking for certain errors. When these errors are found, an email is fired off to an email distribution list.
I have created a script that does everything I need. However, there are a few other caveats.
1.) I cannot touch the file at all. Meaning, I cannot recreate the file, create a temporary file from it, or "reset" the original and get back to it.
Our company policy will not let us modify the original file at all.
That is my biggest obstacle to completing this script.
So, I was trying to find out how I can make this script find only the new errors that happen within a specified time frame. For example, if I set up the script on a cronjob to run every 30 minutes and it fires off at 3pm, how can I make it find any errors that have occurred since 2:30pm?
I really just need to find the new errors, because otherwise it grabs every error that has ever occurred.
I think the easiest solution is not to have a cronjob at all. Read from the file in “follow mode” (i.e., with tail -f). Read from the pipe one line at a time and send off a mail for each error found. For example,
Code:
# --line-buffered (GNU grep) keeps grep from batching its output when writing to a pipe
tail -n +1 -f the_file | grep --line-buffered 'errorpattern' | while read -r line; do
	…
	# Code to format and send email with affected line as $line
	…
done
Sorry, I posted before reading your second post. You originally said you are unable to make temporary files, but above you say that you can do this. If you have a temporary file with the matching lines which you can save across the cronjobs, then you can just use comm (or diff) to get the new lines.
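To sketch what the comm approach might look like: the filenames here (prev_errors.txt, current_errors.txt) are placeholders, and you would need to create an empty prev_errors.txt once before the first run. One wrinkle: comm wants sorted input, and repeated identical error lines can show up as "new".

```shell
# Collect this run's matches (the_file and the pattern are placeholders)
grep 'errorpattern' the_file > current_errors.txt

# comm -13 prints lines unique to the second file,
# i.e. matches that were not present in the previous run
comm -13 <(sort prev_errors.txt) <(sort current_errors.txt)

# Save this run's matches for the next comparison
mv current_errors.txt prev_errors.txt
```

Note that this only touches the temp files, never the original.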
How about making the script get the size of the file you're checking? Every time the script runs, it should only analyse the last n bytes, where n is the number of bytes added since the last check.
This will only work if the file is ever-growing, though. Just my little idea.
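A minimal sketch of that byte-offset idea, assuming the log is the_file and the saved position lives in offset_file (both names are placeholders). As noted, this breaks if the file is ever truncated or rotated.

```shell
#!/bin/bash
# Read the byte offset saved by the previous run (0 on the first run)
offset=0
[ -e offset_file ] && offset=$(<offset_file)

size=$(wc -c < the_file)
if [ "$size" -gt "$offset" ]; then
    # tail -c +N starts output at byte N, so this reads only the newly appended bytes
    tail -c +"$((offset + 1))" the_file | grep 'errorpattern'
fi

# Remember how far we have read, for the next run
printf '%s\n' "$size" > offset_file
```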
Quote:
Originally Posted by osor
Sorry, I posted before reading your second post. You originally said you are unable to make temporary files, but above you say that you can do this. If you have a temporary file with the matching lines which you can save across the cronjobs, then you can just use comm (or diff) to get the new lines.
Yes. I have to leave the original file in place. I cannot modify it at all. I should have explained that a little better.
But I could copy that file to a temp file. That would allow me to use the copy to do my work against, and maybe do some other things as well.
Thoughts?
Yes, you could, but you have not commented on the other solutions posed. In particular, if you have tail continuously watch for changes to the file, you will get the offending lines as they occur (and you don’t need a cronjob; just leave the script running all the time, like a daemon).
The other solution is Maligree’s in which you could save the number of lines (or bytes, but lines is easier to deal with) to a counter file, which would be persistent through each run of the cronjob. The script would look something like this:
Code:
# Read the line count saved by the previous run (0 on the first run)
if [ -e countfile ]; then
	count=$(<countfile)
else
	count=0
fi

# Examine only the lines added since the last run
errors=$(tail -n +$((count + 1)) the_file | grep 'errorpattern')
if [ -n "$errors" ]; then
	printf 'The following errors were found\n\n%s\n' "$errors" | mailx -s 'Blah blah blah' email@email.com
fi

# Save the current line count for the next run
wc -l < the_file > countfile
But you have to have permission to write to countfile.
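If you go the cron route, the crontab entry itself is simple. A sketch (the script path is hypothetical):

```
# crontab -e entry: run the error check every 30 minutes
*/30 * * * * /path/to/check_errors.sh
```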
Last edited by osor; 05-20-2008 at 07:03 PM.
Reason: oops
I've received some more guidance from my manager and it remains the same: he does not want to alter the original file at all. "Leave it alone" were the exact words.
Perl *might* be a possibility. It requires our SAs to install some things, so it is being considered.
One thing I came across when working on this is that when I egrep through the file looking for my string, it works. However, what I noticed is that the information is on three lines IN the original file.
Line 1 - The timestamp (when the error occurred).
Line 2 - The error code.
Line 3 - The contents of the error.
So now I need to grab all three lines. Can I do that with grep? Or do I have to use something else?
I am still working around some things, and will try the countfile as suggested above.
I also have been thinking about what was suggested about using the diff command as well.
The last one I was thinking about was getting a line count on the actual file itself. However, I'm not sure how to proceed here.
Anyone have further suggestions? I feel like I have hit a brick wall lately.
I'd definitely go with Perl in that case; this is getting a bit fiddly to do reliably in bash.
Note that the Perl solution just reads the file, no alterations involved, so calm the manager down.
You could handroll it in Perl using stat(), which I've also done myself (before I came across that module), but it's fiddly and the module I mentioned handles it all for you cleanly.
Quote:
Originally Posted by dba_guy
So now I need to grab all three lines. Can I do that with grep? Or do I have to use something else?
Check to see what kind of grep you have available. The GNU version of grep has all sorts of bells and whistles (including context lines) which you can read about in the man page. Otherwise, awk and perl can do the same.
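For instance, assuming the error code sits on the middle line (timestamp above, contents below), GNU grep's context flags can pull in the neighbouring lines, and a small awk one-liner can do the same portably. Pattern and filename are placeholders, and the awk sketch assumes matches never start on the first or last line:

```shell
# GNU grep: 1 line of context before (-B1) and after (-A1) each match,
# so the timestamp and the error contents come along with the error code
grep -B1 -A1 'errorpattern' the_file

# Portable awk equivalent: print the previous line, the match, and the next line
awk '/errorpattern/ { print prev; print; getline nxt; print nxt } { prev = $0 }' the_file
```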
Quote:
Originally Posted by dba_guy
I am still working around some things, and will try the countfile as suggested above.
I also have been thinking about what was suggested about using the diff command as well.
I think in terms of ease of implementation the strategies are (from easiest to hardest):
1. Tail follow for continuous monitoring.
2. Cron job with a persistent file holding the line count from the previous job.
3. Cron job with a persistent copy of the previous job's error list.
4. Cron job with a persistent copy of the file from the previous job.
Notice that the first requires the least in terms of keeping any record on disk, and it is the most preferable (IMHO).
Quote:
Originally Posted by dba_guy
Last one I was thinking about was getting a line count on the actual file itself. However, not sure how to proceed here.
That’s a fairly standard grep option (see manual).
GNU grep does have flags to grab n preceding or following lines, if you want to go that route, and it can also give you a count of matching lines.
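For completeness, a couple of one-liners along those lines (pattern and filename are placeholders; neither command modifies the file):

```shell
# Count the lines that match the pattern
grep -c 'errorpattern' the_file

# Line count of the whole file, without the filename in the output
wc -l < the_file
```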
I'd have to stick in the Perl camp on this one, if this keeps going and turns into a combination of external file information gathering and internal data mangling, ad nauseam.
Your SAs should be okay with installing Perl. It comes prepackaged for most distros and, if there's a security concern, they can compile it from source with the -DPARANOID flag, or insist that they control all scripts, or wrap their custom Perl so that it's always running with taint checking (-T).
If I were in your situation, at this point, I'd do as little as possible until somebody made up their mind.
j.k. you do what you've gotta do. Best of luck to you