Hi.
The awk language can often be more understandable than perl.
Here is a timing of your code, the code with perl in it, and an awk script. The data file is 1000 lines and was randomly generated so that the mean of the first column is 107; the second column contains the line number, so we expect about 500 values to be more than 107.1:
Code:
% ./doit
Data file contains number of lines: 1000
User code 1, expr, expect wrong answers:
Lines greater than 107.1 = 959
real 0m0.849s
user 0m0.291s
sys 0m0.515s
User code 2, perl:
Lines greater than 107.1 = 503
real 0m2.047s
user 0m0.979s
sys 0m0.981s
Code 3, gawk:
Lines greater than 107.1 = 503
real 0m0.005s
user 0m0.003s
sys 0m0.002s
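For reference, a data file like the one described above could be produced with a short awk program. This is a hypothetical sketch (the actual generator was not shown); it approximates a normal value around 107 by summing twelve uniform deltas:

```shell
# Hypothetical generator for a data file like r2: first column is a
# value distributed around 107, second column is the line number.
awk 'BEGIN {
    srand()
    for (i = 1; i <= 1000; i++) {
        # sum of 12 uniforms minus 6 approximates a standard normal
        s = 0
        for (j = 1; j <= 12; j++) s += rand()
        printf "%.3f %d\n", 107 + (s - 6), i
    }
}' > r2
```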
The awk script is:
Code:
#!/bin/sh
rm -f r3
gawk '
{ lines++ }
$1 < 107.1 { hits++; print $2 >> "r3" }
END { print " Lines greater than 107.1 =", lines-hits }
' r2
Using shell constructs for large numbers of comparisons is slow, and shell arithmetic only works on integers (zsh may be different, since it supports floating point). Adding a load of the perl interpreter for each comparison means one process startup per line, which is very expensive.
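For illustration, the expensive pattern looks something like this, a hypothetical reconstruction (the original slow code was not shown) in which every line of the file forks and execs a full perl interpreter just to compare two numbers:

```shell
# Hypothetical sketch of the slow per-line-perl pattern.
# A tiny sample file stands in for r2 here.
printf '%s\n' '106.5 1' '108.2 2' '110.0 3' '100.1 4' > r2.sample

count=0
while read -r a b; do
    # one full perl startup per comparison -- this is the bottleneck
    if perl -e "exit(($a > 107.1) ? 0 : 1)"; then
        count=$((count + 1))
    fi
done < r2.sample
echo " Lines greater than 107.1 = $count"
```

On a 1000-line file that is roughly 1000 process startups, which is where the two seconds in the perl timing above go; the awk version does the whole file in a single process.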
Briefly, the awk script says: for each line, increase the variable "lines" by 1; if the first value on the line is less than 107.1, append the second field to file r3; and at the end-of-file of r2, print the count of lines that were not below the threshold.
Learning a language like awk can save you a lot of time, and awk is likely to be more accessible than perl. (Once you get used to awk, you can convert awk scripts to perl automatically if you desire.) ... cheers, makyo
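The automatic conversion mentioned above is typically done with a2p, the awk-to-perl translator that shipped with older Perl distributions (it was removed from the core distribution around Perl 5.22, so it may not be present on modern systems). A hedged sketch, using a stripped-down version of the counting script:

```shell
# Write a small awk program to a file so a2p can translate it.
cat > count.awk <<'EOF'
{ lines++ }
$1 < 107.1 { hits++ }
END { print " Lines greater than 107.1 =", lines-hits }
EOF

# Translate to perl if a2p is available (it may not be).
if command -v a2p >/dev/null 2>&1; then
    a2p count.awk > count.pl
    perl count.pl r2
fi
```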