LinuxQuestions.org - number of lines to the new file

Hi.

If you have a really long file, you may want to consider some optimization. I don't have any really large files, but here are results from working with a file of around 1 GB. I assume that for a match, you want only the first hit. I have reduced the requirement from 10 lines to 2 to save posting space:

Code:

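#!/bin/bash -

# @(#) s1      Demonstrate obtaining a segment, piece, part of a file.

echo
echo "(Versions displayed with local utility \"version\")"
# "version" and "_eat" are local utilities; this line is skipped if "version" is absent.
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) sed grep cgrep
set -o nounset
echo

# Data file: first argument, or the 1 GB test file by default.
FILE=${1-/tmp/test-one-gb}

echo "  Lines in data file $FILE:"
time wc -l "$FILE"

echo
echo " Results, sed:"
# No quit: sed prints lines 434-435 but keeps reading to the end of the file.
time sed -n '434,435 p' "$FILE"

echo
echo " Results, sed with quit:"
# Quit at line 436, so the rest of the file is never read.
time sed -n -e '434,435 p' -e '436 q' "$FILE"

echo
echo " Results, grep, max-count:"
# GNU grep: stop after the first match, plus 1 line of trailing context.
time grep --max-count=1 -A 1 -n "nightmare" "$FILE"

echo
echo " Results, cgrep, -N matches:"
echo " http://www.bell-labs.com/project/wwexptools/cgrep/"
time cgrep -N 1 +1 -n -D "nightmare" "$FILE"

exit 0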

Code:

$ ./s1

(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.11-x1, i686
Distribution        : Xandros Desktop 3.0.3 Business
GNU bash 2.05b.0
GNU sed version 4.1.2
grep (GNU grep) 2.5.1
cgrep - (local: ~/executable/cgrep Sep 28 2007 )

  Lines in data file /tmp/test-one-gb:
14754910 /tmp/test-one-gb

real    0m19.423s
user    0m1.049s
sys    0m1.645s

 Results, sed:
nightmare to a dead sartainty.  Landlord, I whispered, that aint the
harpooneer, is it?  Oh, no, said he, looking a sort of diabolically funny,

real    0m5.926s
user    0m5.020s
sys    0m0.807s

 Results, sed with quit:
nightmare to a dead sartainty.  Landlord, I whispered, that aint the
harpooneer, is it?  Oh, no, said he, looking a sort of diabolically funny,

real    0m0.001s
user    0m0.002s
sys    0m0.000s

 Results, grep, max-count:
434:nightmare to a dead sartainty.  Landlord, I whispered, that aint the
435-harpooneer, is it?  Oh, no, said he, looking a sort of diabolically funny,

real    0m0.001s
user    0m0.000s
sys    0m0.001s

 Results, cgrep, -N matches:
 http://www.bell-labs.com/project/wwexptools/cgrep/
434:nightmare to a dead sartainty.  Landlord, I whispered, that aint the
435:harpooneer, is it?  Oh, no, said he, looking a sort of diabolically funny,

real    0m0.002s
user    0m0.001s
sys    0m0.002s

Note that without the "quit", sed reads through the entire file, whereas with the "quit" it stops right after the requested lines, which is much faster (about 5.9 s versus 0.001 s real time in the run above). If you want to match on a pattern instead, GNU grep has an option (--max-count) to stop after "n" hits. If you don't have GNU grep, you can obtain cgrep, which has similar features (and much more), from the site noted in the script.

cheers, makyo
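If it helps, the two fast variants can be wrapped in small shell functions. This is only a sketch: the names print_lines and first_match are made up for illustration and are not part of the script above, and first_match assumes a grep that supports -m and -A (GNU grep does). Here sed quits on the last requested line itself rather than on the line after it.

```shell
#!/bin/bash

# Hypothetical helper: print lines START..END of FILE, then quit
# so that sed does not read the remainder of the file.
print_lines() {
    local start=$1 end=$2 file=$3
    sed -n -e "${start},${end}p" -e "${end}q" "$file"
}

# Hypothetical helper: print the first line matching PATTERN plus
# AFTER lines of trailing context, with line numbers.
# Assumes a grep with -m (max count) and -A (context), e.g. GNU grep.
first_match() {
    local pattern=$1 after=$2 file=$3
    grep -m 1 -A "$after" -n -- "$pattern" "$file"
}
```

With the test file from the post, "print_lines 434 435 /tmp/test-one-gb" would correspond to the "sed with quit" case, and "first_match nightmare 1 /tmp/test-one-gb" to the grep case.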