LinuxQuestions.org

santoshi_natesan 06-12-2013 05:50 AM

Help with shell script
 
Hi,
I need help with a shell script, specifically with reading a log file. A scheduled job has to grep for "error" in the log file and copy the matching lines to another file. On its next run, the job should look only for errors logged after the last error it has already copied.

If this is the last error that was copied:
2013-04-29 21:44:11,086 [ERROR] com.ct.gd.web.tags.AdSelectTag - Failed to get Ad Content.

then the next run should start reading the log file from this time:
2013-04-29 21:45

I'm confused about how to achieve this. Please help.

konsolebox 06-12-2013 06:40 AM

You should simply use another file to record the timestamp (or other identifying info) of the last error that was taken, and use it on the next run to know where to start reading again. In bash alone, or in awk, you could loop through all the lines every time you run the script (a bit slow) to get the next error.
Code:

#!/bin/bash

shopt -s extglob || {
    echo "Unable to set extended glob."
    exit 1
}

LOGFILE=logfile.log
MARKERFILE=${LOGFILE}.mark
SOMEWHERE=somewhere.file

D='[[:digit:]]'  ## exactly one digit; repeated below to match the fixed-width timestamp
DATETIMEPATTERN="${D}${D}${D}${D}-${D}${D}-${D}${D} ${D}${D}:${D}${D}:${D}${D},${D}${D}${D}"
MARKERFOUND=false
NEXTERRORFOUND=false

if read MARK < "$MARKERFILE" && [[ $MARK == $DATETIMEPATTERN ]]; then
    while IFS= read -r LINE; do
        if [[ $LINE == ${MARK}' [ERROR]'* ]]; then
            MARKERFOUND=true
            while IFS= read -r LINE; do
                if [[ $LINE == ${DATETIMEPATTERN}' [ERROR]'* ]]; then
                    NEXTERRORFOUND=true
                    NEWDATETIME=${LINE%%' [ERROR]'*}  ## assuming our lines don't have multiple " [ERROR]"s, which is unlikely.  Else better use awk.
                    break 2
                fi
            done
        fi
    done < "$LOGFILE"
fi

if [[ $MARKERFOUND == false ]]; then
    # Just read the next error from the start.
    # We could place this same/similar code in a function to prevent redundancy.
    while IFS= read -r LINE; do
        if [[ $LINE == ${DATETIMEPATTERN}' [ERROR]'* ]]; then
            NEXTERRORFOUND=true
            NEWDATETIME=${LINE%%' [ERROR]'*}
            break
        fi
    done < "$LOGFILE"
fi

if [[ $NEXTERRORFOUND == true ]]; then
    # Do the changes.
    echo "$LINE" >> "$SOMEWHERE"  ## Using >> to append and not overwrite.
    echo "$NEWDATETIME" > "$MARKERFILE"
fi

We could also use grep to shorten the code.
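A minimal sketch of what a grep-based version might look like, assuming the same log layout as above. The function name next_error is made up for illustration, and the -m option (stop after the first match) is a GNU/BSD grep feature rather than POSIX:

```shell
#!/bin/bash
# Sketch only: next_error is a hypothetical helper, not part of the script above.
# Prints the first " [ERROR]" line that follows the line containing the
# marked stamp, or the first " [ERROR]" line in the file when the mark is
# empty or no longer present in the log.
next_error() {
    local logfile=$1 mark=$2 start=1 n
    if [[ -n $mark ]]; then
        # Line number of the first line containing "<mark> [ERROR]".
        n=$(grep -n -m 1 -F -- "$mark [ERROR]" "$logfile" | cut -d: -f1)
        if [[ -n $n ]]; then
            start=$((n + 1))
        fi
    fi
    # Read from the line after the mark and stop at the first error.
    tail -n +"$start" "$logfile" | grep -m 1 -F ' [ERROR]'
}
```

It would be called as next_error "$LOGFILE" "$MARK", and the timestamp of the returned line would then be written to the marker file as before.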

David the H. 06-13-2013 10:22 AM

I think you're in luck, in that the time stamp is basically in the proper ISO-8601 format. This means you should be able to just use standard string comparisons on them.
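As a quick illustration of that point (is_newer is just a name made up for this sketch), two stamps in this fixed-width layout can be compared as plain strings:

```shell
#!/bin/bash
# Sketch: is_newer is a hypothetical helper name.
# Succeeds when stamp $1 sorts strictly after stamp $2.
# This relies on both stamps using the same fixed-width
# YYYY-MM-DD HH:MM:SS layout, so the first differing character is a digit.
is_newer() {
    [[ $1 > "$2" ]]
}

is_newer '2013-04-29 21:45:02' '2013-04-29 21:44:11' && echo "newer"
is_newer '2013-04-29 21:44:11' '2013-04-29 21:44:11' || echo "not newer"
```

The first call succeeds and the second fails, because an identical stamp is not strictly newer.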

It would help a bit if you explained a bit more about exactly what the script is supposed to do; the way the code is structured can depend on it. But this is my proof of concept version:

Code:

#!/bin/bash

#define your file locations
#the first is the name of the file the timestamp is stored in
lastfile=/path/to/lastfile
logfile=/path/to/logfile
outfile=/path/to/outputfile

#get the last time stamp from the external file.  If no entry, use 0.
read -r lasttime <"$lastfile"
lasttime=${lasttime:-0}

#this is the error string to grep for.
errorstring='my error string'

#extract all lines with matching errors from the logfile and store in an array.
mapfile -t errmessages < <( grep "$errorstring" "$logfile" )

#now loop through the matched errors and print to the $outfile only
#the ones that have a timestamp less than the $lasttime
for line in "${errmessages[@]}"; do
    cur_stamp=${line:0:19}
    if [[ $cur_stamp < $lasttime || $cur_stamp == $lasttime ]]; then
        continue
    fi
    echo "$line"
done >"$outfile"

#save the most recent timestamp to the $lastfile, overwriting the old one.
#skip the write when grep matched nothing, so the old stamp isn't blanked.
[[ -n $cur_stamp ]] && echo "$cur_stamp" >"$lastfile"

exit 0

I haven't tested it, but I think it should work. Note that the "${line:0:19}" expansion extracts the first 19 characters from the line (the date and time down to the second), which I feel is safer than the other possible parameter substitutions.
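For reference, this is what the substring expansion yields on the sample line from the first post. Note that the milliseconds after the comma are not included:

```shell
#!/bin/bash
# Sample line taken from the first post in this thread.
line='2013-04-29 21:44:11,086 [ERROR] com.ct.gd.web.tags.AdSelectTag - Failed to get Ad Content.'

stamp=${line:0:19}    # offset 0, length 19
echo "$stamp"         # prints: 2013-04-29 21:44:11
echo "${#stamp}"      # prints: 19
```

A length of 23 would keep the milliseconds as well, if a finer-grained comparison were wanted.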

It might possibly be slow-performing on large datasets too, although since it only loops over the lines extracted from grep, it presumably won't have too much work to do.

Still, if performance is an issue it may be better done in a language like Perl.

konsolebox 06-13-2013 10:51 AM

Quote:

Originally Posted by David the H. (Post 4971057)
#now loop through the matched errors and print to the $outfile only
#the ones that have a timestamp less than the $lasttime

I guess your version depends on the timestamps rather than on the order of the lines. That helps consistency if log entries aren't written in sequence, for example when they're buffered before being written, but it would cause trouble if the clock changes.
Quote:

mapfile -t errmessages < <( grep "$errorstring" "$logfile" )

Just a suggestion to make it more compatible with earlier versions:
Code:

if [[ BASH_VERSINFO -ge 4 ]]; then
    mapfile -t errmessages
else
    errmessages=()
    while IFS= read -r line; do
        errmessages[${#errmessages[@]}]=$line
    done
fi < <(exec grep "$errorstring" "$logfile")  # exec avoids spawning an extra process

Quote:

Code:

[[ $cur_stamp < $lasttime ]]

And for this one, there are versions of bash that don't compare well with '<' or '>' if the second argument isn't quoted, so this is better:
Code:

[[ $cur_stamp < "$lasttime" ]]

David the H. 06-16-2013 12:18 PM

Quote:

Originally Posted by konsolebox (Post 4971074)
I guess your version depends on the timestamps rather than on the order of the lines. That helps consistency if log entries aren't written in sequence, for example when they're buffered before being written, but it would cause trouble if the clock changes.

I'm not sure I follow your meaning here. :scratch: The whole idea of my script is that it's not based directly on time, but text sorting order. As long as the stamps are formatted in some variation of YYYY-MM-DD HH:MM:SS, the older entry should always sort before the newer one.

Are you saying that we can't trust the log's timestamps for some reason?

I notice that I got the direction wrong in my script comment, though.

Quote:

Just a suggestion to make it more compatible with earlier versions:

Yes, this is a good idea. I sometimes overlook the fact that not everyone is using up-to-date software. :)

Quote:

And for this one, there are versions of bash that don't compare well with '<' or '>' if the second argument isn't quoted, so this is better:
Code:

[[ $cur_stamp < "$lasttime" ]]

This is another point I'm not sure about. AIUI, there have been some changes in how bash deals with locale-specific collation order, as addressed by the compat32 and compat40 shell options, but I haven't heard anything about quoting being involved. Do you have any details about this?

And would it affect the current test in any case?

konsolebox 06-17-2013 08:29 AM

Quote:

Originally Posted by David the H. (Post 4972944)
Are you saying that we can't trust the log's timestamps for some reason?

Well, that's just the difference between our approaches; nothing is really wrong. And yes, sometimes we can't trust it. For example, two programs may log at the same time, and the entry with the earlier timestamp may land in the file after the later one because it was held in a buffer and not written immediately. It can also happen if the clock suddenly changes, such as after a reboot or battery failure, and is later corrected. If a clock failure or bad setting produces a log entry whose timestamp is older than the one currently marked, that entry won't be included in the processing.
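A small sketch of the case I mean, using two made-up log lines where the second one was written late and so carries an older stamp than the first:

```shell
#!/bin/bash
# Sketch with invented log lines: the second entry has an older stamp
# than the first because it was written late.
log_lines=(
    '2013-04-29 21:45:02 [ERROR] written first'
    '2013-04-29 21:44:50 [ERROR] written late'
)
mark='2013-04-29 21:45:02'   # stamp saved after processing the first line

# Timestamp filter: keep only lines whose stamp sorts after the mark.
kept=()
for line in "${log_lines[@]}"; do
    if [[ ${line:0:19} > "$mark" ]]; then
        kept[${#kept[@]}]=$line
    fi
done
echo "timestamp filter kept ${#kept[@]} line(s)"   # prints: ... kept 0 line(s)

# Sequence filter: keep every line that follows the marked line.
after=()
seen=false
for line in "${log_lines[@]}"; do
    if [[ $seen == true ]]; then
        after[${#after[@]}]=$line
    elif [[ ${line:0:19} == "$mark" ]]; then
        seen=true
    fi
done
echo "sequence filter kept ${#after[@]} line(s)"    # prints: ... kept 1 line(s)
```

The late entry is dropped by the timestamp comparison but picked up when going by position in the file.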

Quote:

This is another point I'm not sure about. AIUI, there have been some changes in how bash deals with locale-specific collation order, as addressed by the compat32 and compat40 shell options, but I haven't heard anything about quoting being involved. Do you have any details about this?

And would it affect the current test in any case?

I'm not sure; perhaps it's not really critical. The only thing I'm certain of is that with older versions of bash, when I was writing one of my scripts, the results were inconsistent and changed from time to time. I was able to fix that by putting the second argument in quotes. Perhaps it's related to the similar handling of the second argument in == or =~. And I don't think it's related to the locale, as no locale setting was changed during the fix.

It's likely that quoted strings were handled differently from unquoted strings in the second argument of < or >.

santoshi_natesan 06-18-2013 01:39 AM

Hi Konsolebox and David,

Thanks for the help, but my bash version is 3, so mapfile doesn't work on it. Also, there are two types of error I'm looking for: "Server Error" and Error. "Server Error" lines don't have a timestamp, so the timestamp has to be taken from the next line.

I'm posting the code I've written, which is very simple as I am a beginner.


#!/bin/bash
h1=`hostname`
d1=`date`
ABCLOGS=/path to logfile
XYZlog1=/tmp
for fname in $ABCLOGS
do
    abc=$(basename $fname)
    f=${fname}/ABCLog.log.servererror
    var=`tail -1000 $f | grep -A 20 "Server Error" `
    echo "$var" >/tmp/abclogcopy.txt
done
for fname1 in $XYZlog1
do
    xyz1=$(basename $fname1)
    f1=${fname1}/xyzlogcopy.txt
    var1=`cat $f1 | grep -c "Server Error" `
    echo $var1
    if [ "$var1" -ge "5" ] ;
    then
        echo "$var1"
        if [ ! -z "$var" ] ;
        then
            echo "$var error on $xyz on server $h1 @ $d1" | mailx -s "error on $xyz" <mail-addr>
            #break
        else
            echo "No matching error found on server $h1 @ $d1" >>/tmp/jobrun.txt
        fi
    else
        echo "Count was not found"
    fi
done

David the H. 06-19-2013 12:53 PM

I'm fairly sure my general idea could still be made to work.

mapfile is just a convenience feature that can be replaced with a while loop, as konsolebox demonstrated.

As for the second issue, an array may still be a way to go. Just make sure you grab all the important lines of output, then if an error line matches, you can simply grab the +1 entry and process that.
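A rough sketch of that "+1 entry" idea. The sample lines are invented, since the real log layout hasn't been posted yet, and the array is filled without mapfile or += so it should also run on bash 3:

```shell
#!/bin/bash
# Sketch only: these sample lines are made up for illustration.
lines=(
    'Server Error'
    '2013-04-29 21:45:02,120 [ERROR] com.example.Foo - request failed'
    '2013-04-29 21:45:03,000 [INFO] com.example.Foo - retrying'
)

stamps=()
for (( i = 0; i < ${#lines[@]}; i++ )); do
    if [[ ${lines[i]} == *'Server Error'* ]]; then
        # Take the stamp from the following entry, if there is one.
        if (( i + 1 < ${#lines[@]} )); then
            next=${lines[i+1]}
            stamps[${#stamps[@]}]=${next:0:19}
        fi
    fi
done
printf '%s\n' "${stamps[@]}"    # prints: 2013-04-29 21:45:02
```

In the real script the array would be filled from the log with a while read loop instead of being hard-coded.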

I'm also considering the possibility of using sed, with a range match including the last updated line, as a way to limit the input.
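Something along these lines is what I have in mind, as an untested-in-production sketch. The helper name from_mark is made up, and it assumes the saved stamp contains no characters that are special to sed (no slashes, dots, or brackets), which holds for stamps like the one in the first post:

```shell
#!/bin/bash
# Sketch: from_mark is a hypothetical helper. It prints the log from the
# first line that starts with the saved stamp through the end of the file.
# If the stamp is not found, nothing is printed.
from_mark() {
    local logfile=$1 mark=$2
    sed -n "/^${mark}/,\$p" "$logfile"
}
```

The output still includes the marked line itself, so the caller would skip the first line (for example with tail -n +2) before grepping for new errors.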

It would help if you could provide a longer example of the log text, and show us exactly what you want to get from it.


As for your posted script, I'm not going to break it down in detail at this time, but here are a few general scripting points that should be addressed in it, at least:

Don't Read Lines With For!
Useless Use Of Cat
$(..) is highly recommended over `..`
Parameter substitution can replace basename
[[..]] should generally be used for string/file tests, and ((..)) for numerical tests
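To make those points concrete, here is a short sketch that applies each of them to a throwaway file. The file contents and variable names are made up for the example:

```shell
#!/bin/bash
# Sketch with a temporary file; its contents are invented.
f=$(mktemp)
printf '%s\n' 'Server Error' 'ok line' 'Server Error' > "$f"

# Read lines with while/read rather than for.
n=0
while IFS= read -r line; do
    n=$(( n + 1 ))
done < "$f"

# Let grep read the file itself instead of piping from cat,
# and use $(..) rather than backticks.
count=$(grep -c 'Server Error' "$f")

# Parameter expansion in place of basename.
name=${f##*/}

# ((..)) for the numeric test, [[..]] for the string test.
if (( count >= 2 )) && [[ -n $name ]]; then
    echo "$count matches in $name ($n lines)"
fi
rm -f "$f"
```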


And please use [code][/code] tags around your code and data, to preserve the original formatting and to improve readability. Do not use quote tags, bolding, colors, "start/end" lines, or other creative techniques. Thanks.

bonnydeal 07-16-2013 06:41 AM

You should start by analyzing your requirement.
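If the requirement is to copy ERROR messages from the logfile to a different file without repetition, then you can simply start a background shell process. With GNU grep, --line-buffered makes each match reach the output file as soon as it appears rather than in blocks.

% nohup tail -f "$LOGFILE" | grep --line-buffered ERROR > "$OUTFILE" &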


All times are GMT -5.