LinuxQuestions.org
Old 11-02-2005, 08:02 PM   #1
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Rep: Reputation: 15
Shell script: substitute a file's content according to a map?


I want to replace certain strings in a file with other strings according to a map. For example, the map is:
Code:
A    STR1
B    STR2
C    STR3
...
and the file is:
Code:
......<$A>....<$C>....
...<$C>....<$B>.....<$A>
............<$B>......
Right now the only way I know is a shell script that fetches one map line at a time and does the substitution:
Code:
_map_num=`wc -l $_map | cut -d" " -f1`
i=1
while [ $i -le $_map_num ]; do
   line=`sed -n "$i p" $_map`
   # <do the substitution using $line>
   i=`expr $i + 1`
done
But I don't think this is efficient, because sed rereads the whole $_map file on every iteration just to fetch one line. Is there a way to process both files in a single pass using only awk? Or is there some other way to make this more efficient?

Thanks.
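
One way to handle both files in a single awk process, as asked above, is to load the map into an array while reading the first file and substitute while reading the second. This is only a minimal sketch: it assumes the two-column map shown above with single-word values, a non-empty map file, and literal `<$KEY>` placeholders. The function name `apply_map` is made up for illustration.

```shell
#!/bin/sh
# apply_map MAPFILE TEMPLATE
# Reads "KEY VALUE" pairs from MAPFILE, then prints TEMPLATE with every
# literal <$KEY> replaced by VALUE. Each file is read once by one awk process.
apply_map() {
    awk '
    NR == FNR {                      # first file: the map
        if (NF >= 2) map["<$" $1 ">"] = $2
        next
    }
    {                                # second file: the template
        out = ""
        rest = $0
        # Find the next "<$" and the ">" that closes it.
        while ((p = index(rest, "<$")) > 0) {
            q = index(substr(rest, p), ">")
            if (q == 0) break
            tok = substr(rest, p, q)
            if (tok in map)
                out = out substr(rest, 1, p - 1) map[tok]
            else
                out = out substr(rest, 1, p + q - 1)   # unknown key: copy as-is
            rest = substr(rest, p + q)
        }
        print out rest
    }' "$1" "$2"
}
```

Called as `apply_map map.txt template.txt` (both file names hypothetical). The replacement is done with index/substr rather than gsub, so characters such as `&` in a value are not treated specially.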
 
Old 11-02-2005, 10:09 PM   #2
chrism01
Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Centos 6.5, Centos 5.10
Posts: 16,269

Rep: Reputation: 2028
Here's a sed example that replaces every occurrence of the source string in all target files:
Code:
sed -i -e 's/Internalitem_code/internal_item_code/g' *.sql
HTH
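
The same sed idea can be stretched to cover a whole map: turn every map line into one `s` command and let sed read the generated script with `-f`, so the target file is processed once. A rough sketch, assuming the simple "KEY VALUE" map from the first post and keys and values free of `/`, `&` and `\`. The names `map_to_sed` and `substitute` are invented here.

```shell
#!/bin/sh
# map_to_sed MAPFILE
# Prints one sed command per "KEY VALUE" line, for example: s/<\$A>/STR1/g
map_to_sed() {
    awk 'NF >= 2 { printf "s/<\\$%s>/%s/g\n", $1, $2 }' "$1"
}

# substitute MAPFILE TEMPLATE
# Generates the sed script into a temporary file and runs sed once.
substitute() {
    script=$(mktemp) || return 1
    map_to_sed "$1" > "$script"
    sed -f "$script" "$2"
    rm -f "$script"
}
```

Usage would be something like `substitute map.txt template.txt > result.txt`, with all three file names being placeholders.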
 
Old 11-03-2005, 07:46 PM   #3
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Original Poster
Rep: Reputation: 15
Look at this code:
Code:
_map_num=`wc -l $_map_file | cut -d" " -f1`
_map=`cat $_map_file`

if [ -z $_result ]; then _result=`cat $_tpl`; fi

i=1
while [ $i -le $_map_num ]; do
     line=`echo "$_map" | sed -n "$i p"`

 # echo "$_map" | while read line; do 
The "sed" version works, but I don't think it is efficient enough. The "while read" version has no effect.

What's wrong?
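
A likely cause (an assumption, since the rest of the loop is not shown here): in bash, dash and most other shells, every part of a pipeline runs in a subshell, so a variable assigned inside `echo "$_map" | while read line` is gone once the loop ends. Redirecting the file into the loop keeps the loop in the current shell. The sketch below only counts lines to show the difference; `piped` and `redirected` are illustrative names.

```shell
#!/bin/sh
# Shows why a variable set inside "cmd | while read" can be lost,
# and how redirecting the input into the loop avoids that.
map_file=$(mktemp) || exit 1
printf 'A: one\nB: two\nC: three\n' > "$map_file"

# Pipeline form: in bash and dash the loop body runs in a subshell,
# so the increments are not visible after "done".
piped=0
cat "$map_file" | while read -r line; do
    piped=$((piped + 1))
done

# Redirect form: the loop runs in the current shell.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done < "$map_file"

echo "piped=$piped redirected=$redirected"
rm -f "$map_file"
```

With the redirect form, an assignment like `_result=...` inside the loop would still be set after `done`.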
 
Old 11-03-2005, 08:02 PM   #4
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Original Poster
Rep: Reputation: 15
In more detail: I have written a script that generates SQL from a template and a string map (I will use it to generate SQL automatically and query a MySQL database with "mysql" continuously for testing in my project). That way, the next time I test another database, I don't need to change the code; I just supply a different template and map.

The template looks like this:
Code:
BEGIN;
SET INSERT_ID=SELECT MAX(UID) FROM ACCSTORE<$NUM> + 1;
# SET INSERT_ID=<$INSERT_ID>;
INSERT INTO `ACCSTORE<$NUM>` ( ACCOUNT , PASSWD , STATE , TYPE) VALUES( '<$ACCOUNT>' , '<$PASSWD>' , '0' , '0');
INSERT INTO `BASEINFO<$NUM>` ( ACCOUNT , ADDRESS , BIRTH , CREATETIME , EMAIL , IDCARD , MPHONE , MPTYPE , NICKNAME , PHONE , POSTNUM , SEX , SUPERPASSWD , TNAME , TOKENRING , UID) VALUES( '<$ACCOUNT>' , '' , '' , '' , '<$NAME>@<$DOMAIN>' , '' , '' , '0' , '' , '' , '' , '0' , '<$NAME>' , '' , '<$MD5>' , 'SELECT MAX(UID) FROM ACCSTORE<$NUM> + 1');
INSERT INTO `POINTBONUS<$NUM>` ( ACCOUNT , UID) VALUES( '<$ACCOUNT>' , 'SELECT MAX(UID) FROM ACCSTORE<$NUM> + 1');
COMMIT;
This database has 7 types of tables, and every type has 31 tables. There is a hash program that determines which table to insert into.

And the map file looks like this:
Code:
NUM:            cmd(random_int 0 32 %04d)
ACCOUNT:        cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z - _ @ 1 2 3 4 5 6 7 8 9 0")
PASSWD:         cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 - = _ + @ % , . ; A B C D E J G H I J K L M N O P Q R S T U V W X Y Z")
NAME:           user
DOMAIN:         example.com.cn
MD5:            cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 - = _ + @ % , . ;" | md5sum | cut -d" " -f1)
STR:            cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z - _ @ 1 2 3 4 5 6 7 8 9 0 - = + % , . 周 鹏 中 文 ; 绿 衣 细 不 A B C D E F G H I J K L M N O P Q R S T U V W X Y Z")
DATE:           cmd(date -d "`random_str 0 $RANDOM` days ago" -I)
NUSER:          new_user_name
NDOMAIN:        example.com
IDNUM:          cmd(expr $(random_int 0 1000000))
The script looks up the map to find the strings to substitute. If a value has the form cmd(.*), the script executes it as a shell command with "eval", and the output becomes the replacement string. The script looks like this:
Code:
#!/bin/sh

# This script uses an SQL template and a map written for that template to generate SQL queries for testing.
# The template contains strings of the form <$STR>; each is replaced by looking up the corresponding STR in the map file.
# This borrows the principle of a code generator.

# 周鹏, Chowroc at atgame, 20051102

_tpl=""
_map_file="map.txt"
_map=""
_result=""

# The following function produces a random integer within the specified range
random_int()  {
    start=$1
    range=`expr $2 - $1`
    if [ $# -eq 3 ]; then 
        format=$3; 
    else
        format="%d"
    fi

    num=`echo "" | awk "{srand(); print int(rand()*$range)+$start; }"`
    printf "$format" $num 
}

# The following function produces a random variable-length string
random_str()  {
    len="length()"
    if [ $# -eq 2 ]; then len=$2; fi
    echo $1 \
        | sed 's/ /\n/'g  \
        | while read L; do echo "$L $RANDOM"; done  \
        | sort -k2n  \
        | cut -d" " -f1  \
        | while read L; do echo -n $L; done  \
        | awk "{srand(); num=int(rand()*$len)+1; print substr(\$0,1,num)}"
        # | awk "{srand(); num=int(rand()*length())+1; print substr(\$0,0,num)}"
        # | awk '{srand(); num=int(rand()*length())+1; print substr($0,0,num)}'
    echo
}

usage()  {
    echo "usage: $0 -f template [-m map]"
}

while [ $# -ge 1 ]; do
    if [ "$1" = "-f" ]; then
        _tpl=$2
        # If $2 is empty, the wc and cat below would stall waiting on standard input
        shift
    elif [ "$1" = "-m" ]; then
        _map_file=$2
        shift
    # elif [ "$1" = "-" ]; then
    #   _result=``
    fi
    shift
done

# if [ $# -eq 0 ]; then
if [ -z "$_tpl" ] || [ -z "$_map_file" ]; then usage; exit 1; fi

if ! [ -f $_tpl ]; then
    echo "No such template file."
    exit 1
fi

if ! [ -f $_map_file ]; then
    echo "No such map file."
    exit 1
fi

_map_num=`wc -l $_map_file | cut -d" " -f1`
_map=`cat $_map_file`

if [ -z "$_result" ]; then _result=`cat "$_tpl"`; fi
# Should try reading from standard input to see whether that improves execution efficiency!

i=1
while [ $i -le $_map_num ]; do
    # line=`sed -n "$i p" $_map_file` 
    line=`echo "$_map" | sed -n "$i p"`
    # Read one line of the map on each pass

# echo "$_map" | while read line; do
# cat $_map_file | while read line; do
    # echo "$line"

    if echo "$line" | grep 'cmd(.*)' >/dev/null 2>&1; then
        cmd=`echo "$line" | sed 's/^.*cmd(\(.*\))$/\1/'`
        src=`echo "$line" | awk -F: '{print $1}'`
        dst=`eval "$cmd"`
    else
        src=`echo "$line" | awk -F: '{print $1}'`
        dst=`echo "$line" | awk -F: '{print $2}' | sed 's/^ *//'`
        # sed is used here to strip the leading spaces. Is there a better way?
    fi
    fi
    # Depending on the case: if there is a cmd(.*), execute it as a shell command and use its output as the replacement string
    # echo "<\$$src> --> $dst"
    echo "<\$$src>"
    echo $dst
    # _result=`echo "$_result" | sed "s/<\$$src>/$dst/g"`
    _result=$(echo "$_result" | sed "s/<\$$src>/$dst/g")
    # Perform the substitution
    i=`expr $i + 1`
done

echo "$_result"
Any suggestions? As I said, the problem now is that "sed" is not efficient, but "read" has no effect.

Thanks
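
Much of the per-line cost in the script above comes from starting grep, sed and awk several times for every map entry. As a sketch only, the key and value of a "KEY: value" line can be split with shell parameter expansion instead, which starts no extra process for plain values. The function name `parse_map_line` is invented for this example; `src` and `dst` follow the script's own naming. Like the original, it runs whatever is inside cmd(...) through eval, so the map file has to be trusted.

```shell
#!/bin/sh
# parse_map_line LINE
# Sets two globals using shell builtins only:
#   src - the key before the first ':'
#   dst - the replacement: the output of the command inside cmd(...),
#         or the literal text after the colon with leading blanks removed
parse_map_line() {
    src=${1%%:*}
    dst=${1#*:}
    # Remove leading whitespace from the value.
    dst=${dst#"${dst%%[![:space:]]*}"}
    case $dst in
        'cmd('*')')
            dst=${dst#'cmd('}      # drop the leading "cmd("
            dst=${dst%')'}         # drop the final ")"
            dst=$(eval "$dst")     # run the command, keep its output
            ;;
    esac
}
```

For example, `parse_map_line 'NAME:           user'` leaves `src` set to NAME and `dst` set to user.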
 
Old 11-04-2005, 03:20 AM   #5
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
I like it, great idea.

I think maybe you are stretching the limits of shell programming here.

When you start using lots of pipes to sed, awk, cut and grep,
it gets very inefficient and messy.

Maybe you should consider using perl?

OR (not so good, but did you know?)

You can use m4 for simple substitution.
Maybe do the substitutions in a first pass and the evals in a second pass?

You can substitute on the m4 command line like so:
Code:
m4 -DA=STR1 -DB=STR2 -DC=STR3  template
or put the definitions in a file.
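
To illustrate the "definitions in a file" variant, here is a small sketch. The file names `defs.m4` and `template` and the sample sentence are made up. Note that m4 expands any bare word that matches a macro name, so a placeholder written as `<$A>` would come out as `<$STR1>`; the template convention would probably need to change for m4 to fit.

```shell
#!/bin/sh
# Put the definitions in one file and pass it to m4 ahead of the template.
workdir=$(mktemp -d) || exit 1

# "dnl" discards the rest of the line, so the definitions emit no blank lines.
cat > "$workdir/defs.m4" <<'EOF'
define(`A', `STR1')dnl
define(`B', `STR2')dnl
define(`C', `STR3')dnl
EOF

printf 'first A, then C, then B\n' > "$workdir/template"

if command -v m4 >/dev/null 2>&1; then
    result=$(m4 "$workdir/defs.m4" "$workdir/template")
    echo "$result"
else
    echo "m4 is not installed" >&2
fi
rm -rf "$workdir"
```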
 
Old 11-04-2005, 03:26 AM   #6
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
suggestion:

how about getting random words from a dictionary
( /usr/dict/words ? ) instead of just random letters?
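
A possible way to pick a random dictionary word with a single pass over the file is reservoir sampling in awk. This is a sketch under assumptions: the dictionary path differs between systems, and the function name `random_word` and its optional seed argument are invented here. Without a seed, awk's srand() uses the time of day, so two calls within the same second would return the same word.

```shell
#!/bin/sh
# random_word FILE [SEED]
# Prints one line of FILE chosen uniformly at random in a single pass
# (reservoir sampling with a reservoir of one line).
random_word() {
    awk -v seed="${2:-}" '
    BEGIN { if (seed != "") srand(seed); else srand() }
    # Line number NR replaces the current pick with probability 1/NR.
    rand() * NR < 1 { pick = $0 }
    END { if (NR > 0) print pick }' "$1"
}
```

It could be called as `random_word /usr/share/dict/words "$$"`, for instance, using the shell's process ID as a crude seed.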
 
Old 11-06-2005, 05:09 AM   #7
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Original Poster
Rep: Reputation: 15
I tried getting random words from /usr/share/dict/linux.words before; that was also slow, for the same reason.

Right now it generates about 1 SQL transaction per second (90 s to 2 min for 105 transactions). I tested using "while read", but that only made it about 10 to 20 seconds faster. If I put
echo "$_result"
inside the "while read" loop,
I get the right result on every iteration, but I don't know how to get the result out of the loop efficiently!

I think another problem is that the _result variable must be reassigned on every iteration, which makes it inefficient.
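
One way around reassigning `_result` on every pass would be to collect all the substitutions first and run sed over the template only once. The sketch below assumes "KEY: value" map lines with literal values (no cmd(...) handling) and `<$KEY>` placeholders; `escape_repl` and `render` are hypothetical names. It still starts one small sed per map entry to escape the value, but the template itself is read a single time.

```shell
#!/bin/sh
# escape_repl STRING
# Escapes \, / and & so STRING is safe as the replacement part of s/.../.../
escape_repl() {
    printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

# render MAPFILE TEMPLATE
# Builds one sed script from the map, then applies it in a single sed run.
render() {
    script=''
    while IFS= read -r line; do
        [ -n "$line" ] || continue
        src=${line%%:*}
        dst=${line#*:}
        dst=${dst#"${dst%%[![:space:]]*}"}   # strip leading whitespace
        # Append one command of the form s/<\$KEY>/value/g plus a newline.
        script="${script}s/<\\\$${src}>/$(escape_repl "$dst")/g
"
    done < "$1"
    sed -e "$script" "$2"
}
```

Because the loop reads from a redirect rather than a pipe, `script` keeps its value after the loop ends.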

Maybe I will rewrite it in Python, which I'm learning. But before that, I want to test concurrency: if I run 20 copies of the script at the same time, what will the result be?

Thanks for your help. :-)

Last edited by Chowroc; 11-06-2005 at 05:26 AM.
 
Old 11-07-2005, 05:14 AM   #8
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
Quote:
Before I have tried to get random words from /usr/share/dict/linux.words, also slow for the same reason.
you could always try converting it to a dbm database if you have time, it's very simple and very quick.
 
Old 11-07-2005, 07:41 AM   #9
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Original Poster
Rep: Reputation: 15
Quote:
Originally posted by bigearsbilly
you could always try converting it to a dbm database if you have time, it's very simple and very quick.
Could you give me some details?

I have tried running it concurrently, and it made no difference, even with a simple C program that forks child processes to do the task!

In fact, I found that if I fork 5 child processes, the program waits about 5 seconds, and about 20 seconds for 20 children. The execution is not spread evenly, and the total time is close to the earlier results.

I don't know the exact reason.

Thank you very much.
 
Old 11-08-2005, 03:56 AM   #10
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
example? for /usr/dict/words into DBM database?

I have some dbm code somewhere, It could be changed to make a simple
random word generator. Is this what you mean?
 
Old 11-14-2005, 08:45 PM   #11
Chowroc
Member
 
Registered: Dec 2004
Posts: 145

Original Poster
Rep: Reputation: 15
Quote:
Originally posted by bigearsbilly
example? for /usr/dict/words into DBM database?

I have some dbm code somewhere, It could be changed to make a simple
random word generator. Is this what you mean?
Yes, that's what I mean. Thank you.
 
Old 11-15-2005, 04:08 AM   #12
bigearsbilly
Senior Member
 
Registered: Mar 2004
Location: england
Distribution: FreeBSD, Debian, Mint, Puppy
Posts: 3,287

Rep: Reputation: 173
ok I will dig it out.
it's at home so laters.
 
  

