LinuxQuestions.org
Old 11-23-2010, 10:57 AM   #1
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Rep: Reputation: 19
find the total of numbers that are higher than x in a text file with numbers (using awk??)


Hi there,

I have a file: list.txt that contains this:

Code:
1
2
3
5
4
6
9
8
2
1
3
6
4
7
9
and want to count how many numbers are higher than x. If x is 7, then the answer in my example above would be 3. FYI: My actual files have hundreds of values in one column and also contain decimals.

I can imagine that awk may do the trick, but could not find it in my regular references:
http://www.grymoire.com/Unix/Awk.html
http://www.gnu.org/software/gawk/manual/gawk.html

Your help is appreciated a lot!

Last edited by Mike_V; 11-23-2010 at 11:00 AM.
 
Old 11-23-2010, 11:07 AM   #2
dugan
Senior Member
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 4,758

Rep: Reputation: 1465
Code:
awk '{if ($1>7) print $1}' input.txt | wc -l
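
A minimal sketch of the same idea with the threshold passed in from the shell instead of hard-coded. The variable name x and the temporary file are assumptions for illustration; the data is the sample list from the first post.

```shell
# Sample data copied from the first post, written to a temporary file.
list=$(mktemp)
printf '%s\n' 1 2 3 5 4 6 9 8 2 1 3 6 4 7 9 > "$list"

# x is an illustrative shell variable; -v copies its value into awk's x.
x=7
count=$(awk -v x="$x" '$1 > x' "$list" | wc -l)

# $((...)) strips the padding some wc implementations add.
echo "$((count))"
rm -f "$list"
```

Because awk compares the field numerically, the same line should also work for multi-digit values and decimals, for example x=0.05.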

Last edited by dugan; 11-23-2010 at 11:20 AM.
 
1 members found this post helpful.
Old 11-23-2010, 11:23 AM   #3
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Original Poster
Rep: Reputation: 19
Thanks dugan. It indeed works for the sample data.

There are two issues:

1.
If I try your line and use not 8-9 but 8-10, it doesn't work. Even if I add 10 and 11 to my list (is it limited to 1 digit?)

2.
More importantly, as you also acknowledged, my real data is more complex.
Here is a sample of the real data:

Code:
0.0820013
0.0294894
0.0269461
0.0327966
0.0877525
0.0385039
0.0271613
0.0284816
0.0623967
0.0427087
and I would like to know how many are above, say, 0.05.

Thanks!

Last edited by Mike_V; 11-23-2010 at 11:24 AM.
 
Old 11-23-2010, 11:26 AM   #4
dugan
Senior Member
 
Registered: Nov 2003
Location: Canada
Distribution: distro hopper
Posts: 4,758

Rep: Reputation: 1465
I edited my post to give you a better answer.

I don't mind providing a single line to help with homework, but may I ask what the real-world problem was?

The original solution, btw, was indeed limited to one digit integers:

Code:
cat input.txt | egrep '^[8-9]$' | wc -l
I recommend learning enough about regular expressions to understand why.
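
A small sketch of why: the bracket expression [8-9] matches exactly one character, and the ^ and $ anchors require that character to be the whole line, so the pattern compares text, not numeric value. The sample values here are made up for illustration.

```shell
# One-digit, two-digit, and decimal sample values (illustrative only).
values=$(printf '%s\n' 8 9 10 11 0.08)

# The regex accepts only lines that consist of the single character 8 or 9.
regex_count=$(printf '%s\n' "$values" | grep -E -c '^[8-9]$')

# awk compares the field as a number, so 10 and 11 are counted as well.
numeric_count=$(printf '%s\n' "$values" | awk '$1 > 7 { n++ } END { print n + 0 }')

echo "regex: $regex_count, numeric: $numeric_count"
```

With these values the regex should count 2 lines and the numeric comparison 4.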

Last edited by dugan; 11-23-2010 at 11:33 AM.
 
1 members found this post helpful.
Old 11-23-2010, 12:01 PM   #5
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Original Poster
Rep: Reputation: 19
your edited first post indeed does the trick! Very nice. Thanks Dugan!

My real-world problem: people lie in an MRI scanner while we measure changes in neuronal activity over time. For 10 minutes the brain is measured every 2.5 seconds (one measurement is one 3D volume; there are 240 volumes, which makes a 4D dataset). People are instructed to lie as still as possible, but everyone moves to some degree. We perform rigid-body motion correction on each volume (fitting each volume to the first volume and storing the result as a new 4D dataset). The file above is the relative motion correction in 3D space (so in the x, y, and z directions) in millimeters. By relative I mean the change from one volume to the next, not absolute (the change from the first volume). I want to know how many times a person has moved more than .5 mm. That's what your one-liner is going to do... I have to do this for a couple of hundred subjects, so my life just got a lot easier, thanks! I'm a psychologist working with brain data... in the process I've learned some programming, but I should indeed learn a bit more about regular expressions.
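
For a couple of hundred subjects, one possible approach is a shell loop that runs the same count on every file. This is only a sketch: the directory, the file names subj01.txt and subj02.txt, and the 0.05 threshold are assumptions for illustration, not the actual layout of the study data.

```shell
# Hypothetical per-subject files in a temporary directory.
dir=$(mktemp -d)
printf '%s\n' 0.0820013 0.0294894 0.0623967 > "$dir/subj01.txt"
printf '%s\n' 0.0269461 0.0327966 > "$dir/subj02.txt"

# Print "file count" for each subject; n + 0 prints 0 when nothing exceeds the limit.
report=$(for f in "$dir"/*.txt; do
    n=$(awk -v limit=0.05 '$1 > limit { n++ } END { print n + 0 }' "$f")
    printf '%s %s\n' "$(basename "$f")" "$n"
done)

echo "$report"
rm -rf "$dir"
```

Redirecting the loop output to a file would give a two-column table that is easy to load into a statistics package.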

Last edited by Mike_V; 11-23-2010 at 12:06 PM.
 
Old 11-23-2010, 01:27 PM   #6
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1269
You can also do it completely in awk:

Code:
awk '{ if ( $1 > 0.05 ) num++ }END{ print num }' test.txt
Awk is neat because it has C-like and sometimes C compatible syntax (printf). Great for working with tables of data, and with floating point arithmetic.
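
As a rough sketch of the printf point, several statistics can be collected in one pass and formatted C-style. The sample values come from earlier in the thread; the choice of statistics and the output labels are assumptions, and the max logic assumes non-negative values and non-empty input.

```shell
# One pass: count values above 0.05, track the maximum, report the fraction.
stats=$(printf '%s\n' 0.0820013 0.0294894 0.0269461 0.0327966 0.0877525 |
    awk '$1 > 0.05 { n++ }
         $1 > max  { max = $1 }
         END { printf "above=%d max=%.4f fraction=%.2f\n", n, max, n / NR }')

echo "$stats"
```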
 
1 members found this post helpful.
Old 11-23-2010, 03:10 PM   #7
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Original Poster
Rep: Reputation: 19
H_TeXMeX_H: also big thanks! This is even easier to combine in an awk one-liner with some other stats that I need to extract.
 
Old 11-23-2010, 05:29 PM   #8
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,561

Rep: Reputation: 1939
Probably not of great value here but you can also use awk to condense things like this (at the expense of readability):
Code:
awk '{c[($1>0.5)]++}END{print c[1]}' file
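
A short sketch of how the condensed form works: in awk a comparison evaluates to 1 when true and 0 when false, so ($1>0.5) can serve as an array index, and c[0] and c[1] end up holding the two tallies. The input values and output labels are made up for illustration.

```shell
# c[1] counts values above 0.5, c[0] counts everything else.
tally=$(printf '%s\n' 0.2 0.7 0.4 0.9 0.6 |
    awk '{ c[($1 > 0.5)]++ }
         END { print "at_or_below=" (c[0] + 0), "above=" (c[1] + 0) }')

echo "$tally"
```

Adding 0 to each element makes an empty tally print as 0 instead of an empty string.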
 
1 members found this post helpful.
Old 11-23-2010, 07:57 PM   #9
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Original Poster
Rep: Reputation: 19
One more additional question (and it's not crucial, but it would be nice to solve). If I run this one:

Code:
awk '{ if ( $1 > 0.05 ) num++ }END{ print num }' test.txt
and for one file there is not a single number larger than 0.05, the output will be empty (=nothing).

Is it easy to output a zero in that case (how do the "if then else" rules work in awk??)
 
Old 11-23-2010, 09:20 PM   #10
barriehie
Member
 
Registered: Nov 2010
Distribution: Debian Lenny
Posts: 136
Blog Entries: 1

Rep: Reputation: 23
@ Mike_V: Regarding learning a bit more about regular expressions (regexp), this got me started: http://www.regular-expressions.info/tutorial.html
 
Old 11-23-2010, 09:23 PM   #11
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,561

Rep: Reputation: 1939
Code:
BEGIN{c=0}
or
Code:
END{if(c)print c; else print 0}

Last edited by grail; 11-23-2010 at 09:25 PM.
 
Old 11-24-2010, 02:52 AM   #12
H_TeXMeX_H
Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

Rep: Reputation: 1269
Quote:
Originally Posted by Mike_V View Post
One more additional question (and it's not crucial, but it would be nice to solve). If I run this one:

Code:
awk '{ if ( $1 > 0.05 ) num++ }END{ print num }' test.txt
and for one file there is not a single number larger than 0.05, the output will be empty (=nothing).

Is it easy to output a zero in that case (how do the "if then else" rules work in awk??)
It's true, in that case it outputs nothing.

Yes, C-like syntax = if else clauses:

Code:
awk '{ if ( $1 > 0.05 ) num++ }END{ if ( num ) print num; else print 0 }' test.txt
or like grail suggests you can initialize it yourself for safety (I usually do anyway to avoid stuff like this):

Code:
awk 'BEGIN{num=0}{ if ( $1 > 0.05 ) num++ }END{ print num }' test.txt
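
Another common idiom, sketched here with made-up input where nothing exceeds the threshold, is to add 0 when printing. An uninitialized awk variable is an empty string in string context and 0 in numeric context, so num + 0 forces the numeric form.

```shell
# No value is above 0.05, so num is never assigned; num + 0 still prints 0.
result=$(printf '%s\n' 0.01 0.02 0.03 |
    awk '$1 > 0.05 { num++ } END { print num + 0 }')

echo "$result"
```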

Last edited by H_TeXMeX_H; 11-24-2010 at 02:55 AM.
 
Old 11-24-2010, 09:51 AM   #13
Mike_V
Member
 
Registered: Apr 2009
Location: Boston MA
Distribution: CentOS 6.2 x86_64 GNU/Linux
Posts: 59

Original Poster
Rep: Reputation: 19
excellent! thanks again
 
  


All times are GMT -5. The time now is 08:19 AM.
