LinuxQuestions.org
Help answer threads with 0 replies.
Go Back   LinuxQuestions.org > Forums > Non-*NIX Forums > Programming
User Name
Password
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Notices

Reply
 
Search this Thread
Old 02-14-2013, 09:15 PM   #31
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,540

Rep: Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878

Quote:
Originally Posted by ntubski View Post
More improvement is possible by using the right algorithm, ...
I can't quite figure out what trs does because all the comments and names in the source are Polish.
I did some tests and it looks like trs isn't using a fast algorithm. trs is faster than sed, but only by a constant factor. For both methods, doubling input quadruples runtime.

Since it appears that there is no good solution for this, I think I'll try writing one using MultiFast: Aho-Corasick C Library.

Code:
~/tmp/replacing$ for size in 10 100 1000 2000 ; do ./replace.sh sed $size ; done
0.01 real	0.00 user	-	sed (10) 
0.05 real	0.04 user	-	sed (100) 
2.93 real	2.86 user	-	sed (1000) 
10.67 real	10.57 user	-	sed (2000) 
~/tmp/replacing$ for size in 10 100 1000 2000 4000 8000 ; do ./replace.sh trs $size ; done
0.00 real	0.00 user	-	trs (10) 
0.01 real	0.00 user	-	trs (100) 
0.24 real	0.23 user	-	trs (1000) 
0.83 real	0.82 user	-	trs (2000) 
3.15 real	3.12 user	-	trs (4000) 
12.44 real	12.34 user	-	trs (8000)
Code:
#!/bin/bash
# replace.sh

method=$1
size=$2

TIMEFORMAT=$'%2R real\t%2U user\t-\t'"$method ($size) "

case "$method" in
    sed)
        time sed -r 's|(^.*) (.*)|s/\1\/\2/g|' "$size-replacements.txt" | \
            sed -f - "$size-unreplaced.txt" > /dev/null
        ;;

    trs)
        time trs -f "$size-replacements.txt" < "$size-unreplaced.txt" > /dev/null
        ;;

    *) echo "I don't know how to do $method" ; exit 1
        ;;
esac
Code:
#!/bin/sh

# Some test inputs: replace a bunch numbers to xxxxxxx. The number of
# replacement pairs is equal to the size of input, so a naive
# algorithm will be show quadratic running times.

makeit()
{
    seq -f'%09.0f xxxxxxxxx' $1 > "$1-replacements.txt"
    seq -f'%09.0f' $1 > "$1-unreplaced.txt"
}

rm *-replacements.txt *-unreplaced.txt

makeit 10
makeit 100
makeit 1000
makeit 2000
makeit 4000
makeit 8000
 
Old 02-16-2013, 01:25 PM   #32
ntubski
Senior Member
 
Registered: Nov 2005
Distribution: Debian
Posts: 2,540

Rep: Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878Reputation: 878
fast-replace shows linear run time:
Code:
~/tmp/replacing$ for size in 1000 2000 4000 8000 16000 32000 ; do ./replace.sh fast-replace $size ; done
0.01 real	0.00 user	-	fast-replace (1000) 
0.02 real	0.01 user	-	fast-replace (2000) 
0.03 real	0.02 user	-	fast-replace (4000) 
0.06 real	0.04 user	-	fast-replace (8000) 
0.10 real	0.08 user	-	fast-replace (16000) 
0.20 real	0.17 user	-	fast-replace (32000)
Added case to replace.sh:
Code:
    fast-replace)
        time ./fast-replace "$size-replacements.txt" < "$size-unreplaced.txt" > /dev/null
        ;;
 
  


Reply


Thread Tools Search this Thread
Search this Thread:

Advanced Search

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off


Similar Threads
Thread Thread Starter Forum Replies Last Post
[SOLVED] awk - field substitutions gafoleyo Linux - Newbie 12 05-13-2012 05:29 PM
code substitutions Loarn Programming 2 07-14-2011 07:07 PM
string substitutions within a file cleopard Programming 1 09-05-2008 04:52 PM
variables within sed substitutions? ocicat Programming 3 07-29-2007 01:17 PM
Perl: Using Vars in Substitutions cramer Programming 6 08-26-2006 01:52 PM


All times are GMT -5. The time now is 09:42 AM.

Main Menu
Advertisement
My LQ
Write for LQ
LinuxQuestions.org is looking for people interested in writing Editorials, Articles, Reviews, and more. If you'd like to contribute content, let us know.
Main Menu
Syndicate
RSS1  Latest Threads
RSS1  LQ News
Twitter: @linuxquestions
identi.ca: @linuxquestions
Facebook: linuxquestions Google+: linuxquestions
Open Source Consulting | Domain Registration