Here's your program. I don't know whether you're going to use it, but it was a nice exercise for me, as I'm not particularly good at C coding:
Code:
// Calculates the min and max of the integers in the input file.
// Not much input checking is done: reading stops at the first token that is not an integer,
// so use this program wisely. Feed it garbage and you get garbage back.
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[])
{
    FILE *file;
    int num, max, min;

    // make sure we have exactly one argument; argc also counts the program name, hence 2
    if (argc != 2)
    {
        fputs("ERROR: Need exactly 1 argument !\n", stderr);
        printf("Usage:\t minmax file\n");
        exit(1);
    }

    // open file for reading
    file = fopen(argv[1], "r");
    if (file == NULL)
    {
        fputs("ERROR: failed to open file !\n", stderr);
        exit(1);
    }

    // get the first number; stop here if there is none, so min and max are never left uninitialized
    if (fscanf(file, "%d", &num) != 1)
    {
        fputs("ERROR: no number found in file !\n", stderr);
        fclose(file);
        exit(1);
    }

    // set starting values
    min = max = num;

    // read numbers until fscanf stops returning one (EOF or bad input), setting min and max as we go;
    // testing the return value instead of feof() also picks up a last number with no trailing newline
    while (fscanf(file, "%d", &num) == 1)
    {
        if (num < min)
        {
            min = num;
        }
        else if (num > max)
        {
            max = num;
        }
    }

    // print min and max
    printf("min=\t%d\n", min);
    printf("max=\t%d\n", max);

    // close file
    fclose(file);
    return 0;
}
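The program's own comment admits that it does not do much input checking. As a rough sketch of what stricter checking could look like (this is not part of the program above, and `parse_int_line` and `minmax_checked` are made-up names for illustration), each line can be read with `fgets` and validated with `strtol`. It assumes one number per line and treats blank or overlong lines as malformed:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one line as a decimal int. Returns 1 on success, 0 if the line is
 * blank, contains non-numeric text, or does not fit in an int.
 * (Hypothetical helper, not from the original program.) */
int parse_int_line(const char *line, int *value)
{
    char *end;
    long parsed;

    assert(line != NULL && value != NULL);

    errno = 0;
    parsed = strtol(line, &end, 10);
    if (end == line)                 /* no digits were consumed */
        return 0;
    if (errno == ERANGE || parsed < INT_MIN || parsed > INT_MAX)
        return 0;

    /* allow trailing whitespace (including the newline), nothing else */
    while (*end == ' ' || *end == '\t' || *end == '\r' || *end == '\n')
        end++;
    if (*end != '\0')
        return 0;

    *value = (int)parsed;
    return 1;
}

/* Scan a stream line by line. Returns how many valid integers were read
 * (0 means min and max were not set), or -1 at the first malformed line. */
long minmax_checked(FILE *in, int *min, int *max)
{
    char line[64];   /* a longer line is split by fgets and then rejected */
    long count = 0;
    int num;

    assert(in != NULL && min != NULL && max != NULL);

    while (fgets(line, sizeof line, in) != NULL) {
        if (!parse_int_line(line, &num))
            return -1;
        if (count == 0 || num < *min)
            *min = num;
        if (count == 0 || num > *max)
            *max = num;
        count++;
    }
    return count;
}
```

To use it, call `minmax_checked(file, &min, &max)` from `main` after the `fopen` check and print an error when it returns -1 or 0. Expect it to be somewhat slower than the plain `fscanf` loop, since every line is checked; I have not benchmarked it.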
Code:
bash-3.1$ for i in $(seq 100000); do echo $RANDOM >> file; done
bash-3.1$ time awk 'NR==1{
> tempmin=$0
> tempmax=$0
> }
> $0 >= tempmax{ tempmax=$0 }
> $0 <= tempmin { tempmin = $0 }
> END{
> print "min: "tempmin
> print "max: "tempmax
> }' file
min: 0
max: 32767
real 0m0.081s
user 0m0.080s
sys 0m0.001s
bash-3.1$ time sort -n file > sorted && head -n1 sorted && tail -n1 sorted
real 0m0.129s
user 0m0.083s
sys 0m0.004s
0
32767
bash-3.1$ gcc minmax.c -o minmax
bash-3.1$ time ./minmax file
min= 0
max= 32767
real 0m0.023s
user 0m0.022s
sys 0m0.001s
# and if you wanted to be a real geek like me, proceed:
bash-3.1$ echo $CFLAGS
-march=nocona -O2 -pipe -fPIC
bash-3.1$ gcc minmax.c -march=nocona -O2 -pipe -fPIC -o minmax
bash-3.1$ strip --strip-unneeded minmax
bash-3.1$ time ./minmax file
min= 0
max= 32767
real 0m0.020s
user 0m0.019s
sys 0m0.001s
More benchmarking with larger inputs:
Code:
bash-3.1$ for i in $(seq 1000000); do echo $RANDOM >> file; done
bash-3.1$ time awk 'NR==1{
tempmin=$0
tempmax=$0
}
$0 >= tempmax{ tempmax=$0 }
$0 <= tempmin { tempmin = $0 }
END{
print "min: "tempmin
print "max: "tempmax
}' file
min: 0
max: 32767
real 0m0.812s
user 0m0.806s
sys 0m0.006s
bash-3.1$ time sort -n file > sorted && head -n1 sorted && tail -n1 sorted
real 0m1.156s
user 0m1.124s
sys 0m0.032s
0
32767
bash-3.1$ time ./minmax file
min= 0
max= 32767
real 0m0.194s
user 0m0.187s
sys 0m0.006s
Code:
bash-3.1$ for i in $(seq 10000000); do echo $RANDOM >> file; done
bash-3.1$ time awk 'NR==1{
tempmin=$0
tempmax=$0
}
$0 >= tempmax{ tempmax=$0 }
$0 <= tempmin { tempmin = $0 }
END{
print "min: "tempmin
print "max: "tempmax
}' file
min: 0
max: 32767
real 0m8.320s
user 0m8.265s
sys 0m0.053s
bash-3.1$ time sort -n file > sorted && head -n1 sorted && tail -n1 sorted
real 0m16.333s
user 0m14.540s
sys 0m0.284s
0
32767
bash-3.1$ time ./minmax file
min= 0
max= 32767
real 0m1.889s
user 0m1.836s
sys 0m0.052s
As you can see, the differences become quite large with larger amounts of data: in the last run the C program took about 1.9 s, against 8.3 s for awk and 16.3 s for sort. I didn't try it with more data because it was taking a while ...
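If generating the input file is part of what takes a while, the shell loop could be swapped for a small C generator. This is only a sketch under assumptions: `write_random_numbers` is a made-up name, and `rand()` is not the same generator as bash's `$RANDOM`; it merely keeps the same 0..32767 range.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Write `count` pseudo-random integers in 0..32767, one per line, to `out`.
 * Returns 0 on success, -1 if a write fails.
 * (Hypothetical helper, not part of the benchmark above.) */
int write_random_numbers(FILE *out, long count, unsigned int seed)
{
    long i;

    assert(out != NULL && count >= 0);

    srand(seed);
    for (i = 0; i < count; i++) {
        /* RAND_MAX is at least 32767, so the modulo covers the $RANDOM range */
        if (fprintf(out, "%d\n", rand() % 32768) < 0)
            return -1;
    }
    return 0;
}
```

Called from a `main` that opens the output with `fopen("file", "w")`, this writes the whole file through one buffered stream instead of reopening it for every number the way `echo $RANDOM >> file` does.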