Thanks, mRgOBLIN!
I really do like gawk's syntax. It's quite capable of any text processing that I have ever thrown at it.
I did figure out why get-pkgsize was so much faster than audriusk's original awk one-liner ...
The get-pkgsize script invokes nextfile as soon as it finds the /^UNCOMPRESSED/ pattern in the stream, so it reads only the first few lines of each file.
OTOH, the awk one-liner scans the rest of each of the files in /var/log/packages, looking for more instances
of the pattern in each line.
These are the times again. Note that the new one-liner with nextfile is about as fast as the grep pipeline.
-- kjh
p.s. not that the times of 0.02 secs vs 0.14 secs mean anything in the real world where the one-liner will be invoked
Code:
# gnashley's fgrep suggestion:
[konrad@kjhlt5 compat32pkg]$ time fgrep UNCOMPRESSED /var/log/packages/* | awk -F: '{print $3,$1}' | LC_ALL=C sort -rh > /dev/null
real 0m0.020s
user 0m0.009s
sys 0m0.017s
# fskmh's grep approach with audriusk's LC_ALL=C and sort -rh
[konrad@kjhlt5 compat32pkg]$ time grep UNCOMPRESSED /var/log/packages/* | awk -F: '{print $3,$1}' | LC_ALL=C sort -rh > /dev/null
real 0m0.022s
user 0m0.015s
sys 0m0.016s
# audriusk's original awk one-liner
[konrad@kjhlt5 compat32pkg]$ time awk -F: '/UNCOMPRESSED/ {print $2,FILENAME}' /var/log/packages/* | LC_ALL=C sort -rh > /dev/null
real 0m0.137s
user 0m0.131s
sys 0m0.014s
# audriusk's original awk one-liner with nextfile after match
[konrad@kjhlt5 compat32pkg]$ time awk -F: '/^UNCOMPRESSED/ {print $2,FILENAME ; nextfile }' /var/log/packages/* | LC_ALL=C sort -rh > /dev/null
real 0m0.022s
user 0m0.011s
sys 0m0.016s
# get-pkgsize (also invokes nextfile )
[konrad@kjhlt5 compat32pkg]$ time get-pkgsize > /dev/null
real 0m0.040s
user 0m0.028s
sys 0m0.019s
# since we're being anal about runtimes <G>, here's grep with the -m1 arg:
[konrad@kjhlt5 compat32pkg]$ time grep -m1 UNCOMPRESSED /var/log/packages/* | awk -F: '{print $3,$1}' | LC_ALL=C sort -rh > /dev/null
real 0m0.014s
user 0m0.005s
sys 0m0.015s
# and here's fgrep with the same -m1 flag
[konrad@kjhlt5 compat32pkg]$ time fgrep -m1 UNCOMPRESSED /var/log/packages/* | awk -F: '{print $3,$1}' | LC_ALL=C sort -rh > /dev/null
real 0m0.018s
user 0m0.009s
sys 0m0.022s
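
For anyone who wants to see the nextfile effect without trusting wall-clock times, here is a rough sketch that counts how many lines awk actually reads with and without it. The files are made-up stand-ins built in a temp directory (the names pkg-a / pkg-b, the 10M size and the 1000-line file list are all invented for the demo), not real /var/log/packages entries, and it assumes an awk that supports nextfile (gawk does; some older awks may not).

```shell
#!/bin/sh
# Hypothetical demo data: two fake package logs in a temp dir.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

for name in pkg-a pkg-b; do
    {
        echo "PACKAGE NAME: $name"
        echo "UNCOMPRESSED PACKAGE SIZE: 10M"
        # Pad each file with an invented file list, roughly like a real log.
        i=1
        while [ "$i" -le 1000 ]; do
            echo "usr/share/$name/file$i"
            i=$((i + 1))
        done
    } > "$tmp/$name"
done

# Without nextfile: every line of every file is tested against the pattern.
full=$(awk '/^UNCOMPRESSED/ { n++ } END { print n, NR }' "$tmp"/*)

# With nextfile: awk abandons each file right after its first match.
early=$(awk '/^UNCOMPRESSED/ { n++; nextfile } END { print n, NR }' "$tmp"/*)

echo "matches / lines read without nextfile: $full"
echo "matches / lines read with nextfile:    $early"
```

Both runs should find the same 2 matches, but the first reads all 2004 lines while the second stops after 4, which is the same saving the timings above are showing on the real package logs.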