Filtering lines that begin with a number higher than a certain value

frater · 02-02-2012, 05:57 AM

I often use this to count the occurrences of an expression:

grep -o <regexp> <file> | sort | uniq -c | sort -n

It can sometimes give me valuable output.

Piping it through 'tail' only shows me the lines with the highest counts, but for scripting purposes I would like to filter on the value itself.
The value of the first column should be numerically greater than a certain value.

It should of course show the whole line.

Guttorm · 02-02-2012, 08:24 AM

You can filter it with awk, for example:

Code:

... | awk '{ if ($1 > 10) print }'

This prints the whole line if the number in the first column is higher than 10.
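
To see the filter together with the original counting pipeline, here is a minimal sketch. The sample words are made up and only stand in for the output of grep -o; the threshold of 1 is likewise an arbitrary choice for the demonstration.

```shell
# Hypothetical sample data standing in for: grep -o <regexp> <file>
sample='alpha
beta
alpha
gamma
alpha
beta'

# Count each distinct line, sort by count, then keep only the lines
# whose count (first column) is greater than 1.
# The bare pattern '$1 > 1' prints the whole matching line by default.
filtered=$(printf '%s\n' "$sample" | sort | uniq -c | sort -n | awk '$1 > 1')
printf '%s\n' "$filtered"
```

The line for "gamma" is dropped because it occurs only once; the leading padding that uniq -c adds is preserved, since awk prints the line unchanged.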

frater · 02-03-2012, 02:04 AM

Thanks....

fukawi1 · 02-03-2012, 02:22 AM

or sed

Code:

sed '4,5!d' file

Where in this example, 4 is the start of the range (starting from zero, not 1), and 5 is the end range.

frater · 02-03-2012, 02:56 AM

Although it works with a hard-coded value (10 in the example you gave me), I wouldn't know how to pass a variable to the awk expression in an elegant way.

This works

Code:

... | awk '{ if ($1 > 10) print }'

This (of course) doesn't

Code:

AVG=10
... | awk '{ if ($1 > ${AVG}) print }'

@fukawi1:
I think you misunderstood what I want to achieve...
It should use the numerical value of the first column as the filter

Dark_Helmet · 02-03-2012, 03:20 AM

Quote:

Originally Posted by frater

Code:

AVG=10
... | awk '{ if ($1 > ${AVG}) print }'

The problem is with your quoting. Try:

Code:

awk "{ if (\$1 > ${AVG}) print }"

For bash, anything within single quotes is treated literally--no substitutions. Text within double-quotes is scanned for substitutions.

EDIT:
But I should point out that you'll need to protect the $1. Awk needs the $1 preserved for the script to work, but inside double quotes bash would substitute its own first positional parameter for it. Hence the backslash in front of the $, which tells bash you want a literal $ and not a variable substitution.
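
The effect of the quoting can be made visible by building the awk program in a shell variable first and printing it. This is only a sketch; the variable names program, input and result, and the three sample lines, are invented for the illustration.

```shell
AVG=10
# Hypothetical input: a number in the first column, a label in the second.
input='5 five
10 ten
15 fifteen'

# Inside double quotes the shell expands ${AVG} to 10, while \$1 is
# reduced to a literal $1 that is left for awk to interpret.
# (In single quotes the text ${AVG} would reach awk unchanged, which
# awk typically rejects as a syntax error.)
program="{ if (\$1 > ${AVG}) print }"
printf 'awk receives: %s\n' "$program"

result=$(printf '%s\n' "$input" | awk "$program")
printf '%s\n' "$result"
```

Only the line starting with 15 passes, because the comparison is strictly greater than 10.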

frater · 02-03-2012, 04:15 AM

Thanks to you I was able to modify my "regtop" script.
I use it on my Linux systems, where it is executed remotely by Zabbix.
I can enter the "regexp" in Zabbix and am able to extract valuable data from logs.

This is how it worked before:

Code:

# regtop /opt/ASSP/logs/maillog.txt 'Connected: [0-9.]*' '' 1
   2110 Connected: 192.168.10.100
   1040 Connected: 85.214.254.20
     19 Connected: 216.34.181.88
      9 Connected: 80.254.173.44
      3 Connected: 208.101.3.244
Total: 3283 lines with "Connected: [0-9.]*"

Now it's this (and thanks to you):

Code:

# regtop /opt/ASSP/logs/maillog.txt 'Connected: [0-9.]*' '' 1
   2110 Connected: 192.168.10.100
   1040 Connected: 85.214.254.20
Total: 3283 lines with "Connected: [0-9.]*"

This is the code.
Maybe (and hopefully) you can use it too:

# cat /usr/local/sbin/regtop

Code:

#!/bin/bash
#####################################################
# regtop
#####################################################
# Uses logtail & readlink
# http://sourceforge.net/projects/logtail/
#####################################################
# echo 'zabbix ALL =(ALL) NOPASSWD: /usr/local/sbin/regtop' >>/etc/sudoers
#
# grep -iq 'UnsafeUserParameters' /etc/zabbix/zabbix_agentd.conf || sed -i -e 's/^Server.*/&\n\n# Allow regular expressions\nUnsafeUserParameters=1/' /etc/zabbix/zabbix_agentd.conf
#
# echo 'UserParameter=vfs.file.regtop[*],  sudo /usr/local/sbin/regtop   "$1" "$2" "$3" "$4" "$5" "$6"' >>/etc/zabbix/zabbix_agentd.conf
#####################################################
# 08-12-2010 by Frater
#
# The 4th parameter (minutes) is optional.
# When 0 or empty, it will use 'logtail' which only checks the portion which hasn't been parsed before
# When minutes is 1, it will take the whole file
# When it's greater than 1, it will use 'lastmins' that will only output the last x minutes
#####################################################
export PATH=${PATH}:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin
offset=/tmp/regtop.

v=
while getopts v name
do
  case $name in
    v)   v='-v ';;
    ?)   printf "Usage: %s: [-v] <file> <Eregexp>,  [<Eregexp>] , [ <minutes> ]\n" "$0"
         exit 2;;
  esac
done
shift $(($OPTIND - 1))

[ ! -h "$1" ] && [ ! -f "$1" ] && exit 1
[ -z "$2" ]   && exit 1
file2parse="$1"

ftmp1=`mktemp`
ftmp2=`mktemp`

minutes=`echo "$4" | awk -F. '{print $1}' | tr -cd '0-9'`
[ -z "${minutes}" ] && minutes=0

if [ ${minutes} -eq 0 ] ; then
  fname="`readlink -f "$1"`"
  expression="`echo "${fname}.${v}$2$3" | tr '/' '.' | tr -cd '.0-9A-Za-z-'`"
  offset="${offset}${expression}.offset"
  logtail -f "$1" -o $offset >${ftmp2}
  file2parse=${ftmp2}
elif [ ${minutes} -gt 1 ] ; then
  cat "$1" | lastmins ${minutes} >${ftmp2}
  file2parse=${ftmp2}
fi

if [ ! -z "$3" ] ; then
  ftmp3=`mktemp`
  grep $v -E "$3" ${file2parse} >${ftmp3}
  file2parse=${ftmp3}
fi

grep -oE "${2}" ${file2parse} | sort -o ${ftmp1}

total=`wc -l ${ftmp1} | awk '{print $1}'`
if [ $total -gt 0 ] ; then

  rows=`echo "${5}" | tr -cd '0-9'`
  [ -z "${rows}" ] && rows=5
  expression="${2}"

  [ -z "$3" ] || expression="${expression} && $3"
  # get double the amount of rows you want to see (2 * 5 = 10)
  uniq -c ${ftmp1} | sort -rn | head -n$((2 * ${rows})) >${ftmp2}
  # calculate the average value of these lines
  AVG=`cat ${ftmp2} | awk '{avg += $1}END{printf "%d\n", avg/NR}'`
  # Only show the rows that have more than average, but no more than the amount of lines you wanted (default = 5)
  head -n${rows} ${ftmp2} | awk "{ if (\$1 >= ${AVG}) print}"
  echo "Total: $total lines with \"${expression}\""
else
  echo '-'
fi

rm -f ${ftmp1} 2>/dev/null
rm -f ${ftmp2} 2>/dev/null
rm -f ${ftmp3} 2>/dev/null
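
The last step of the script (average the counts, then keep only the rows at or above the average) can be tried in isolation. The sketch below reuses the five counts from the first sample output as test data; the real script averages over up to twice the requested number of rows, so its average may differ. It also passes the average with awk's -v option instead of the escaped double quotes used in the script, purely as a variation.

```shell
# Test data copied from the sample output earlier in this post.
counts='2110 Connected: 192.168.10.100
1040 Connected: 85.214.254.20
19 Connected: 216.34.181.88
9 Connected: 80.254.173.44
3 Connected: 208.101.3.244'

# Integer average of the first column (the sum divided by the number of rows).
avg=$(printf '%s\n' "$counts" | awk '{ sum += $1 } END { printf "%d\n", sum / NR }')

# Keep only the rows whose count is at least the average.
above=$(printf '%s\n' "$counts" | awk -v avg="$avg" '$1 >= avg')

printf 'average: %s\n' "$avg"
printf '%s\n' "$above"
```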

David the H. · 02-03-2012, 11:48 AM

Quote:

Originally Posted by fukawi1

or sed

Code:

sed '4,5!d' file

Where in this example, 4 is the start of the range (starting from zero, not 1), and 5 is the end range.

Sorry, no. All this does is specify a range of line numbers to operate on (in this case delete all except lines 4 and 5).

sed does not have the ability to do value comparisons on the contents, only pattern matching, like grep. For that you need awk or a similar tool.
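
A small side-by-side run shows the difference between selecting by position and selecting by value. The six input lines are made up for this sketch.

```shell
# Hypothetical input: a number in the first column, a label in the second.
input='3 c
12 l
7 g
40 x
25 y
1 a'

# sed addresses lines by position: this keeps lines 4 and 5 of the
# input, whatever numbers they happen to contain.
by_position=$(printf '%s\n' "$input" | sed '4,5!d')

# awk compares the value of the first field: this keeps every line
# whose first column is greater than 10, wherever it appears.
by_value=$(printf '%s\n' "$input" | awk '$1 > 10')

printf 'sed 4,5!d:\n%s\n' "$by_position"
printf 'awk $1 > 10:\n%s\n' "$by_value"
```

The sed result misses the line starting with 12 because it is the second line, even though its value is above 10.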

ntubski · 02-03-2012, 04:28 PM

Code:

awk "{ if (\$1 > ${AVG}) print }"

awk has the -v option for this sort of thing:

Code:

awk -v AVG="$AVG" '{ if ($1 > AVG) print }'
# or just
awk -v AVG="$AVG" '($1 > AVG)'
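
For scripting, the -v form can be wrapped in a small shell function. This is a sketch only: the function name filter_above and the awk variable min are invented here. Adding 0 to both sides is a common awk idiom to force a numeric comparison even if a field does not look like a number.

```shell
# filter_above THRESHOLD
# Reads lines on stdin and prints those whose first field is
# numerically greater than THRESHOLD.
filter_above() {
  # "$1" in double quotes is the function's first argument (shell);
  # $1 inside the single-quoted program is awk's first field.
  awk -v min="$1" '($1 + 0) > (min + 0)'
}

# Usage with made-up input:
printf '5 a\n10 b\n15 c\n' | filter_above 10
```

Because the awk program stays in single quotes, no backslash is needed in front of $1, and the threshold is passed as data rather than pasted into the program text.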