Shell script: substitute a file's content according to a map?
Right now my shell script fetches one line at a time and does the substitution:
_map_num=`wc -l $_map | cut -d" " -f1`
i=1
while [ $i -le $_map_num ]; do
    line=`sed -n "${i}p" $_map`
    <Do substitution by $line>
    i=`expr $i + 1`
done
But I think this is inefficient, because sed re-reads the whole $_map file on every iteration just to fetch one line. Is there a way to process the two files in a single pass, perhaps with awk? Or is there some other way to make it more efficient?
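One single-pass approach is awk's two-file idiom: NR==FNR is true only while the first file (the map) is being read, so the map can be loaded into an array and then applied to the template in one run. A minimal sketch, with hypothetical file names and a simplified "KEY: value" map (not the poster's real files):

```shell
#!/bin/sh
# Hypothetical sample files; the real ones are $_map and the SQL template.
cat > map.txt <<'EOF'
NUM: 0042
ACCOUNT: alice
EOF
cat > template.txt <<'EOF'
INSERT INTO t VALUES ('NUM', 'ACCOUNT');
EOF

# NR==FNR holds only while awk reads the first file (the map): store each
# key -> value pair.  For the second file, replace every key occurrence.
out=$(awk 'NR == FNR {
        key = $1; sub(/:$/, "", key)            # drop the trailing colon
        val = $0; sub(/^[^:]*:[ \t]*/, "", val) # everything after ": "
        repl[key] = val
        next
    }
    {
        for (k in repl) gsub(k, repl[k])
        print
    }' map.txt template.txt)
echo "$out"
```

Both files are read exactly once, instead of re-scanning the map once per line as the sed loop does.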
More detail: I have written a script that generates SQL from a template and a string map (I use it to generate SQL automatically and query a MySQL database with "mysql" continuously, for testing in my project). So next time, if I test another database, I won't need to change the code, just supply a different template and map.
This database has 7 types of tables, and every type has 31 tables. A hash program determines which table to insert into.
And the map file looks like this:
NUM: cmd(random_int 0 32 %04d)
ACCOUNT: cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z - _ @ 1 2 3 4 5 6 7 8 9 0")
PASSWD: cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 - = _ + @ % , . ; A B C D E F G H I J K L M N O P Q R S T U V W X Y Z")
MD5: cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z 1 2 3 4 5 6 7 8 9 0 - = _ + @ % , . ;" | md5sum | cut -d" " -f1)
STR: cmd(random_str "a b c d e f g h i j k l m n o p q r s t u v w x y z - _ @ 1 2 3 4 5 6 7 8 9 0 - = + % , . ; [mis-encoded multibyte (Chinese) characters] A B C D E F G H I J K L M N O P Q R S T U V W X Y Z")
DATE: cmd(date -d "`random_str 0 $RANDOM` days ago" -I)
IDNUM: cmd(expr $(random_int 0 1000000))
The script looks up the map and finds the strings to substitute; if a value matches cmd(.*), it is executed as a shell command via "eval", and its output becomes the replacement string. The script looks like this:
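A minimal sketch of that lookup-and-eval step, under my own assumptions about the parsing (the random_int stub and the exact key/value splitting are placeholders, not the original script):

```shell
#!/bin/sh
# Stand-in for the real helper defined elsewhere in the project.
random_int() { echo 7; }

# One map line, as read inside the loop:
line='NUM: cmd(random_int 0 32)'

key=${line%%:*}            # "NUM"
val=${line#*: }            # "cmd(random_int 0 32)"

case $val in
    cmd\(*\))
        body=${val#cmd\(}  # strip the leading "cmd("
        body=${body%\)}    # strip the trailing ")"
        result=`eval "$body"`   # run it; its output is the replacement
        ;;
    *)
        result=$val        # plain value: use it verbatim
        ;;
esac
echo "$result"
```

Note that each `eval` of a cmd(...) entry forks at least one process, which adds up quickly in a tight loop.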
Earlier I tried pulling random words from /usr/share/dict/linux.words; that was slow for the same reason.
Right now it generates about 1 SQL transaction per second (90 s to 2 min for 105 transactions). I tested "while read"; it was only about 10-20 seconds faster. And if I use:
inside the "while read", I get the right result every cycle, but I don't know how to get the value out of the loop without losing performance!
And I think another problem is that the _result variable must be reassigned on every iteration, which also hurts efficiency.
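On getting values out of the loop: when the loop is fed through a pipe (`cat file | while read ...`), many shells run the loop in a subshell, so variables assigned inside it are lost afterwards. Redirecting the file into the loop keeps it in the current shell. A small sketch with a hypothetical map.txt:

```shell
#!/bin/sh
printf 'a\nb\nc\n' > map.txt

# Piped form loses the variable in many shells (loop runs in a subshell):
#   cat map.txt | while read l; do _result="$_result$l"; done
# Redirected form keeps the loop in the current shell, so $_result survives:
_result=""
while read line; do
    _result="$_result$line"
done < map.txt
echo "$_result"        # prints "abc"
```

This also avoids the extra `cat` process per run.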
Maybe I will rebuild it in Python; I'm learning it now. But before that, I want to test concurrency: if I run 20 copies of the script at the same time, what happens?
Originally posted by bigearsbilly: "You could always try converting it to a dbm database if you have time. It's very simple and very quick."
Could you give me some details?
I have tried concurrency, and it had no effect, even with a simple C program that forks child processes to do the task!
In fact, I found that if I fork 5 child processes, the program waits about 5 seconds, and about 20 seconds for 20 children. The work is not spread evenly across them, and the total time is close to the earlier results.