bash help

brian0918 · 06-24-2003, 11:54 AM

I have a file that may contain information such as this:

99 0.00012
100 0.00011
101 0.00010
102 0.00010
103 0.00009
104 0.00007

--------------------------------------------------------------------------------------------------------

sample text

more text

Besides this little chunk, the file contains tons of other information, and some form of this chunk occurs multiple times in the file (multiple iterations). What I need is the number in the first column at the point where the second column first drops below 0.0001 (so in this case, I need to somehow grab the number 103).

NOTE: The number "0.00010" may appear more than once in any iteration, such as above, or it may not appear at all (skip from 0.00011 to 0.00009 or something)

In order to get near this info, I figured I should do something like:

grep -B 100 "sample text" FILENAME

so that I could grab the 100 lines before each occurrence of "sample text" (which only occurs after these columns of numbers).

But, as far as grabbing the lines below .0001, I'm stumped.

Thanks.

jvannucci · 06-24-2003, 12:26 PM

Use perl:

perl -ne '/^\s*(\d+)\s+(\d+\.\d+)/ and $2 < 0.0001 and print "$1\n"' FILENAME

You could use awk in a similar fashion.
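A minimal sketch of what the awk version might look like, assuming the two columns are whitespace-separated and the second is always written as a plain decimal. The function name below_threshold is hypothetical and only there so the snippet can be reused.

```shell
# Hypothetical helper: print column 1 of every row whose column 2 is below 0.0001.
# The two regex checks skip lines that are not "<integer> <decimal>" rows.
below_threshold() {
    awk '$1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+\.[0-9]+$/ && $2 < 0.0001 { print $1 }' "$@"
}

# Sample data taken from the first post.
printf '%s\n' '99 0.00012' '100 0.00011' '101 0.00010' \
              '102 0.00010' '103 0.00009' '104 0.00007' | below_threshold
```

Called with a file name (below_threshold FILENAME) it reads that file instead of standard input. Like the perl one-liner, it prints every qualifying row, not just the first one in each chunk.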

brian0918 · 06-24-2003, 02:03 PM

Thanks. How would I go about doing it with awk?

Crashed_Again · 06-24-2003, 02:05 PM

Hey jvannucci, that command really scares me. It's very intimidating.

brian0918 · 06-24-2003, 02:14 PM

The perl command doesn't seem to be working. Could someone please explain what the various characters mean?

Thanks.
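For reference, here is a rough breakdown of the pieces in the perl one-liner, written as shell comments, followed by a small runnable check. The grep -E pattern is only an approximate POSIX equivalent of the perl regex, and matching_rows is a hypothetical name.

```shell
# Pieces of the perl one-liner, read left to right:
#   -n        run the code once per input line, without printing automatically
#   -e '...'  take the program text from the command line
#   ^         start of the line
#   \s        one whitespace character; \s* means zero or more of them
#   (\d+)     one or more digits, captured as $1 (the first column)
#   \d+\.\d+  digits, a literal dot, digits; inside parentheses it becomes $2
#   and       short-circuit: print runs only if the match and the comparison succeed

# Hypothetical helper: show which lines a pattern of that shape matches.
matching_rows() {
    grep -E '^[[:space:]]*[0-9]+[[:space:]]+[0-9]+\.[0-9]+' "$@"
}

printf '%s\n' '103 0.00009' 'sample text' '  104 0.00007' | matching_rows
```

Only the two numeric rows should come through. If the real file separates the columns differently (commas, for instance), the pattern would need adjusting.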

brian0918 · 06-24-2003, 02:30 PM

Ok, using a command similar to the one below....

grep -B 50 "sample text" FILENAME | grep -v "sample text" | grep -v -e "-" | awk '{if ($2 < 0.0001) print $1}' > newfile

I was able to get something like this in newfile:

123
124
125
126
127
128
129
130

124
125
126
127
128
129
130

122
123
124
125
126
127
128
129
130

and so on and so forth....

How can I just grab the first line from each of these groups of numbers and put it into a file?

Thanks.
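One way to keep only the first line of each group, sketched under the assumption that the groups in newfile are separated by one or more blank lines, as in the output shown above. The function name first_of_each_group is hypothetical.

```shell
# Hypothetical helper: print only the first line of each blank-line-separated group.
first_of_each_group() {
    awk '
        NF == 0 { seen = 0; next }    # blank line: the next non-blank line starts a new group
        !seen   { print; seen = 1 }   # first non-blank line of the current group
    ' "$@"
}

# Sample shaped like the newfile output above.
printf '%s\n' 123 124 125 '' 124 125 '' 122 123 | first_of_each_group
```

With the real data this would be run as first_of_each_group newfile > firstlines, where firstlines is just a placeholder output name.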

new_user10 · 06-24-2003, 02:36 PM

It's a regular expression: a pattern that each line of the file is matched against. You could try writing a short perl script to do it. Here goes:

1) open a text file in an editor and name it parse.pl (or whatever)
2) type this:

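#!/usr/bin/perl
# (or wherever perl is; run "whereis perl" to find it)
use strict;
use warnings;

# Replace file_path.txt with the path to your file.
open(my $fh, '<', 'file_path.txt') or die "Cannot open file_path.txt: $!";

while (<$fh>) {
    # Match "<integer> <decimal>" at the start of the line.
    if (/^\s*(\d+)\s+(\d+\.\d+)/) {
        if ($2 < 0.0001) {
            print "$1\n";
        }
    }
}
close($fh);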

3) Save this file.
4) At terminal, type:

perl parse.pl

5) This will then print every first-column number (103, 104, etc.) whose second-column value is less than 0.0001.

I hope that helps. I don't really understand the part about "grab the 100 lines before each occurrence of 'sample text'".

brian0918 · 06-24-2003, 02:42 PM

But, the only number I want is the first one that is below .0001 in each group of numbers. See my previous message.

new_user10 · 06-24-2003, 03:10 PM

you could use a switch variable--for example,

$switch = 0;

...same code...
if ($2 < 0.0001 && $switch == 0) {
print "$1\n";
$switch = 1;
}

Then reset the switch to 0 at the end of every block, for example whenever a line matches "sample text".
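The switch idea can also be sketched as a single awk pass, with no grep -B step. This assumes that every block of numbers is followed by a line containing "sample text" (as described in the first post) and that no unrelated line in the file looks like an "<integer> <decimal>" row. The names first_below_per_block, limit and found are hypothetical.

```shell
# Hypothetical helper: for each block, print the first column-1 value whose
# column-2 value is below the limit, then ignore the rest of that block.
first_below_per_block() {
    awk -v limit=0.0001 '
        /sample text/ { found = 0; next }   # end of a block: reset the switch
        found         { next }              # this block has already been reported
        $1 ~ /^[0-9]+$/ && $2 ~ /^[0-9]+\.[0-9]+$/ && $2 + 0 < limit + 0 {
            print $1
            found = 1
        }
    ' "$@"
}

# Two small blocks, the first one taken from the original post.
printf '%s\n' '101 0.00010' '102 0.00010' '103 0.00009' '104 0.00007' \
              '-----' 'sample text' 'more text' \
              '99 0.00011' '100 0.00008' '101 0.00005' 'sample text' |
    first_below_per_block
```

For the sample above this should report 103 for the first block and 100 for the second. On the real file it would be run as first_below_per_block FILENAME > newfile.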

brian0918 · 06-24-2003, 03:16 PM

Thanks