LinuxQuestions.org
Old 02-02-2012, 05:57 AM   #1
frater
Member
 
Registered: Jul 2008
Posts: 121

Rep: Reputation: 23
Filtering lines that begin with a number that has to be higher than a certain value


I often use this to count the occurrences of an expression:

grep -o <regexp> <file> | sort | uniq -c | sort -n

It can sometimes give me valuable output.

Piping it through 'tail' only shows me the lines with the highest counts, but for scripting purposes I would like to have the output filtered.
The value in the first column should be numerically greater than a certain value.

It should of course show the whole line.
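
For readers who have not used this pipeline, here is a minimal sketch of what it produces. The sample data and the helper name count_matches are made up for illustration; they are not from the original post.

```shell
# Hypothetical sample data standing in for <file>.
sample_log='Connected: 10.0.0.1
Connected: 10.0.0.2
Connected: 10.0.0.1
Connected: 10.0.0.3
Connected: 10.0.0.1
Connected: 10.0.0.2'

# Extract every match, then count how often each one occurs,
# least frequent first.
count_matches() {
  grep -oE 'Connected: [0-9.]+' | sort | uniq -c | sort -n
}

counts=$(printf '%s\n' "$sample_log" | count_matches)
printf '%s\n' "$counts"
```

Each output line starts with the count that uniq -c adds, which is the column the rest of the thread filters on. The exact padding in front of the count depends on the uniq implementation.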
 
Old 02-02-2012, 08:24 AM   #2
Guttorm
Senior Member
 
Registered: Dec 2003
Location: Trondheim, Norway
Distribution: Debian and Ubuntu
Posts: 1,453

Rep: Reputation: 448
You can filter it with awk, for example:

Code:
... | awk '{ if ($1 > 10) print }'
This prints the whole line if the number of the first column is higher than 10.
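
A small sketch of that filter on made-up input shaped like "uniq -c" output. The helper name filter_over_10 is only for illustration.

```shell
# Keep only lines whose first whitespace-separated field is greater than 10.
filter_over_10() {
  awk '{ if ($1 > 10) print }'
}

# Three sample lines: a count, then the matched text.
filtered=$(printf '%s\n' \
  '   2110 Connected: 192.168.10.100' \
  '     19 Connected: 216.34.181.88' \
  '      9 Connected: 80.254.173.44' | filter_over_10)
printf '%s\n' "$filtered"
```

awk skips the leading blanks when it splits the line into fields, so $1 is the count, and print without arguments emits the line unchanged.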

Last edited by Guttorm; 02-02-2012 at 08:27 AM.
 
Old 02-03-2012, 02:04 AM   #3
frater
Member
 
Registered: Jul 2008
Posts: 121

Original Poster
Rep: Reputation: 23
Thanks....
 
Old 02-03-2012, 02:22 AM   #4
fukawi1
Member
 
Registered: Apr 2009
Location: Melbourne
Distribution: Fedora & CentOS
Posts: 854

Rep: Reputation: 193
or sed
Code:
sed '4,5!d' file
Where in this example, 4 is the start of the range (starting from zero, not 1), and 5 is the end range.
 
Old 02-03-2012, 02:56 AM   #5
frater
Member
 
Registered: Jul 2008
Posts: 121

Original Poster
Rep: Reputation: 23
Although it works with a literal value (10 in the example you gave me), I don't know how to pass a variable to the awk expression in an elegant way.

This works
Code:
... | awk '{ if ($1 > 10) print }'
This (of course) doesn't
Code:
AVG=10
... | awk '{ if ($1 > ${AVG}) print }'
@fukawi1:
I think you misunderstood what I want to achieve...
It should use the numerical value of the first column as the filter

Last edited by frater; 02-03-2012 at 03:05 AM.
 
Old 02-03-2012, 03:20 AM   #6
Dark_Helmet
Senior Member
 
Registered: Jan 2003
Posts: 2,786

Rep: Reputation: 374
Quote:
Originally Posted by frater
Code:
AVG=10
... | awk '{ if ($1 > ${AVG}) print }'
The problem is with your quoting. Try:
Code:
awk "{ if (\$1 > ${AVG}) print }"
For bash, anything within single quotes is treated literally--no substitutions. Text within double-quotes is scanned for substitutions.

EDIT:
But I should point out that you'll need to protect the $1. Awk needs the $1 preserved for the script to work, but inside double quotes bash would replace it with the shell's first positional parameter. Hence the backslash in front of the $, which tells bash you want a literal $ and not a variable substitution.
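
To see the difference, one can store both variants in shell variables and print what awk would actually receive. This is only a sketch; the variable names single_quoted and double_quoted and the two input lines are made up.

```shell
AVG=10

# Single quotes: the shell leaves the text untouched, so awk would receive
# the literal characters ${AVG}, which it cannot use as the shell variable.
single_quoted='{ if ($1 > ${AVG}) print }'

# Double quotes: the shell expands ${AVG}; \$1 is escaped so that awk
# still receives a literal $1.
double_quoted="{ if (\$1 > ${AVG}) print }"

printf 'single: %s\n' "$single_quoted"
printf 'double: %s\n' "$double_quoted"

# Only the double-quoted program filters on the value of AVG.
result=$(printf '%s\n' '5 a' '15 b' | awk "$double_quoted")
printf '%s\n' "$result"
```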

Last edited by Dark_Helmet; 02-03-2012 at 03:24 AM.
 
Old 02-03-2012, 04:15 AM   #7
frater
Member
 
Registered: Jul 2008
Posts: 121

Original Poster
Rep: Reputation: 23
Thanks to you I could now modify my "regtop" script.
I use this script on my Linux systems, where it is executed remotely by Zabbix.
I can enter the regexp in Zabbix and extract valuable data from logs.

This is how it worked before:

Code:
# regtop /opt/ASSP/logs/maillog.txt 'Connected: [0-9.]*' '' 1
   2110 Connected: 192.168.10.100
   1040 Connected: 85.214.254.20
     19 Connected: 216.34.181.88
      9 Connected: 80.254.173.44
      3 Connected: 208.101.3.244
Total: 3283 lines with "Connected: [0-9.]*"
Now it's this (and thanks to you):
Code:
# regtop /opt/ASSP/logs/maillog.txt 'Connected: [0-9.]*' '' 1
   2110 Connected: 192.168.10.100
   1040 Connected: 85.214.254.20
Total: 3283 lines with "Connected: [0-9.]*"
This is the code.
Maybe (and hopefully) you can use it too:

# cat /usr/local/sbin/regtop
Code:
#!/bin/bash
#####################################################
# regtop
#####################################################
# Uses logtail & readlink
# http://sourceforge.net/projects/logtail/
#####################################################
# echo 'zabbix ALL =(ALL) NOPASSWD: /usr/local/sbin/regtop' >>/etc/sudoers
#
# grep -iq 'UnsafeUserParameters' /etc/zabbix/zabbix_agentd.conf || sed -i -e 's/^Server.*/&\n\n# Allow regular expressions\nUnsafeUserParameters=1/' /etc/zabbix/zabbix_agentd.conf
#
# echo 'UserParameter=vfs.file.regtop[*],  sudo /usr/local/sbin/regtop   "$1" "$2" "$3" "$4" "$5" "$6"' >>/etc/zabbix/zabbix_agentd.conf
#####################################################
# 08-12-2010 by Frater
#
# The 4th parameter (minutes) is optional.
# When 0 or empty, it will use 'logtail' which only checks the portion which hasn't been parsed before
# When minutes is 1, it will take the whole file
# When it's greater than 1, it will use 'lastmins' that will only output the last x minutes
#####################################################
export PATH=${PATH}:/usr/local/sbin:/sbin:/usr/sbin:/bin:/usr/bin
offset=/tmp/regtop.

v=
while getopts v name
do
  case $name in
    v)   v='-v ';;
    ?)   printf "Usage: %s: [-v] <file> <Eregexp>,  [<Eregexp>] , [ <minutes> ]\n" "$0"
         exit 2;;
  esac
done
shift $(($OPTIND - 1))

[ ! -h "$1" ] && [ ! -f "$1" ] && exit 1
[ -z "$2" ]   && exit 1
file2parse="$1"

ftmp1=`mktemp`
ftmp2=`mktemp`

minutes=`echo "$4" | awk -F. '{print $1}' | tr -cd '0-9'`
[ -z "${minutes}" ] && minutes=0

if [ ${minutes} -eq 0 ] ; then
  fname="`readlink -f "$1"`"
  expression="`echo "${fname}.${v}$2$3" | tr '/' '.' | tr -cd '.0-9A-Za-z-'`"
  offset="${offset}${expression}.offset"
  logtail -f "$1" -o $offset >${ftmp2}
  file2parse=${ftmp2}
elif [ ${minutes} -gt 1 ] ; then
  cat "$1" | lastmins ${minutes} >${ftmp2}
  file2parse=${ftmp2}
fi

if [ ! -z "$3" ] ; then
  ftmp3=`mktemp`
  grep $v -E "$3" ${file2parse} >${ftmp3}
  file2parse=${ftmp3}
fi

grep -oE "${2}" ${file2parse} | sort -o ${ftmp1}

total=`wc -l ${ftmp1} | awk '{print $1}'`
if [ $total -gt 0 ] ; then

  rows=`echo "${5}" | tr -cd '0-9'`
  [ -z "${rows}" ] && rows=5
  expression="${2}"

  [ -z "$3" ] || expression="${expression} && $3"
  # get double the amount of rows you want to see (2 * 5 = 10)
  uniq -c ${ftmp1} | sort -rn | head -n$((2 * ${rows})) >${ftmp2}
  # calculate the average value of these lines
  AVG=`cat ${ftmp2} | awk '{avg += $1}END{printf "%d\n", avg/NR}'`
  # Only show the rows whose count is at least the average, and no more rows than requested (default = 5)
  head -n${rows} ${ftmp2} | awk "{ if (\$1 >= ${AVG}) print}"
  echo "Total: $total lines with \"${expression}\""
else
  echo '-'
fi

rm -f ${ftmp1} 2>/dev/null
rm -f ${ftmp2} 2>/dev/null
rm -f ${ftmp3} 2>/dev/null
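
The averaging step at the end of the script can be tried in isolation. The following is a sketch under assumptions, not a replacement for regtop: top_above_average is a hypothetical helper name, the counts are taken from the sample output above, and it passes the average to awk with -v instead of the double-quoted form the script uses.

```shell
# top_above_average ROWS
# Reads "count text" lines on stdin, keeps the 2*ROWS largest counts,
# averages them, and prints at most ROWS lines whose count is at least
# that (integer-truncated) average.
top_above_average() {
  rows=${1:-5}
  tmp=$(mktemp) || return 1
  sort -rn | head -n "$((2 * rows))" >"$tmp"
  avg=$(awk '{ sum += $1 } END { if (NR) printf "%d\n", sum / NR }' "$tmp")
  [ -n "$avg" ] && head -n "$rows" "$tmp" | awk -v avg="$avg" '$1 >= avg'
  rm -f "$tmp"
}

printf '%s\n' \
  '2110 Connected: 192.168.10.100' \
  '1040 Connected: 85.214.254.20' \
  '19 Connected: 216.34.181.88' \
  '9 Connected: 80.254.173.44' \
  '3 Connected: 208.101.3.244' | top_above_average 5
```

With these five counts the sum is 3181 and the truncated average is 636, so only the 2110 and 1040 lines pass, which matches the "after" output shown above.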
 
Old 02-03-2012, 11:48 AM   #8
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Quote:
Originally Posted by fukawi1
or sed
Code:
sed '4,5!d' file
Where in this example, 4 is the start of the range (starting from zero, not 1), and 5 is the end range.
Sorry, no. All this does is specify a range of line numbers to operate on (in this case delete all except lines 4 and 5).

sed does not have the ability to do value comparisons on the contents, only pattern matching, like grep. For that you need awk or a similar tool.
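
A short sketch of the difference, using five made-up input lines whose first column deliberately does not match their line position:

```shell
# Five hypothetical input lines; the leading numbers are data, not line numbers.
input=$(printf '%s\n' '50 a' '40 b' '30 c' '20 d' '10 e')

# sed selects by line position (counted from 1): everything except
# lines 4 and 5 is deleted, whatever the lines contain.
by_position=$(printf '%s\n' "$input" | sed '4,5!d')

# awk selects by the numeric value of the first column.
by_value=$(printf '%s\n' "$input" | awk '$1 > 25')

printf 'sed:\n%s\n' "$by_position"
printf 'awk:\n%s\n' "$by_value"
```

Here sed keeps "20 d" and "10 e" because they are the fourth and fifth lines, while awk keeps the three lines whose first field is greater than 25.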
 
Old 02-03-2012, 04:28 PM   #9
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian, Arch
Posts: 3,784

Rep: Reputation: 2083
Code:
awk "{ if (\$1 > ${AVG}) print }"
awk has the -v option for this sort of thing:
Code:
awk -v AVG="$AVG" '{ if ($1 > AVG) print }'
# or just
awk -v AVG="$AVG" '($1 > AVG)'
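
As a sketch of how this might be wrapped for reuse in a script (filter_above and the awk variable name min are hypothetical, not from the thread):

```shell
# filter_above THRESHOLD
# Prints the stdin lines whose first field is numerically greater than
# THRESHOLD. The awk program stays in single quotes, so the shell never
# touches awk's $1; the threshold reaches awk as the variable "min".
filter_above() {
  awk -v min="$1" '$1 > min'
}

AVG=636
above=$(printf '%s\n' '2110 x' '1040 y' '19 z' | filter_above "$AVG")
printf '%s\n' "$above"
```

Because the value is handed over as data instead of being pasted into the program text, no backslash escaping is needed and the awk program reads the same whatever AVG contains.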
 
  

