LinuxQuestions.org

bluewind 02-12-2010 11:52 AM

awk, sed find and replace recursively from files
 
Hi all,

I am new to Linux as well as to awk, grep and sed.
I need a find-and-replace one-liner or script that loops through an input file (file1), finds each of its entries in file2, and adds "!" in front of the matched string.

Example:

input file: file1

g+h=o+p
a+b=c+d


file2 (the file to search)

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf


Output file (file3 should look like this)

!a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf


I have tried many awk and sed find-and-replace methods, but they did not work the way I wanted. This is mainly due to my lack of experience with awk and sed.

The program should loop through file1, find the first entry (g+h=o+p) in file2 and write the result to file3, then repeat the same process for the second entry (a+b=c+d).

Thanks in advance. I really appreciate any suggestions...

rweaver 02-12-2010 12:00 PM

Awk would be more suited to this particular task than sed, in my opinion. However, are you limited to those applications, or can you also use something like a bash script or Perl?

jschiwal 02-12-2010 12:04 PM

Look at grep's -f option to locate the files containing patterns listed in a pattern file. This will generate a list of files to be edited.
This looks like a homework question. You should provide what you have tried if you want further help.
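
As an illustration of the grep suggestion, here is a minimal sketch using the sample data from the first post. The -F flag (treat patterns as fixed strings) and the output file names are additions made for this example, not something from the thread.

```shell
# Run in an empty scratch directory: this overwrites file1 and file2.
printf '%s\n' 'g+h=o+p' 'a+b=c+d' > file1
printf '%s\n' 'a+b=c+d 1e105' 'x+y=z+s 5e105' \
              'g+h=o+p abcdefg' 't+r=w+q xvyderf' > file2

# -f file1  read one pattern per line from file1
# -F        treat each pattern as a fixed string, not a regular expression
# -l        print only the names of files that contain a match
grep -l -F -f file1 file2 > matching_files.txt

# Without -l, grep prints the matching lines themselves.
grep -F -f file1 file2 > matching_lines.txt
```

With several candidate files on the command line, the -l form gives the list of files that would need editing.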

pixellany 02-12-2010 12:07 PM

This looks like a homework assignment...

Please show us the code that you have tried---also, what books or reference materials are you using?

Here's a big hint:

Code:

while read line; do
  sed "/$line/s<<fill in the details>>" file2
done < file1

This reads one line at a time from file1 and assigns it to the variable "line". It then looks for the content of that variable in file2. If it is found, then it invokes the sed "s" command. (You will fill in the actual code where I have <<fill in the details>>.)

Test first with just this---when it is working, then you can add another re-direction operator to write to file 3.

pixellany 02-12-2010 12:08 PM

"HOMEWORK LOCK"

There's no specific rule about this, but I am **requesting** that no one simply supply the answer--at least until the OP responds.

bluewind 02-12-2010 12:34 PM

Hi all,

One of the methods I tried is this command, but it did not work:

diff file1 file2 | sed '/^[0-9][0-9]*/d; s/^. //; /^---$/d' > file3

Here I created file1, the input file:
!g+h=o+p
!a+b=c+d

file2 (the file to search):

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf

This gives a combination of file1 and file2 like this (output file3):

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf
!g+h=o+p
!a+b=c+d


At the moment I am using references from this forum and these links:

http://www.ibm.com/developerworks/li...ry/l-sed2.html
http://forums.devshed.com/unix-help-...-a-146179.html
http://www.brunolinux.com/02-The_Ter..._with_Sed.html
http://www.gnu.org/software/gawk/man...ml#Very-Simple
http://www.grymoire.com/Unix/Sed.html#uh-8

I will try the hint suggested by pixellany and post updates as soon as I can. It will take me some time to test the suggested hints because I am really new to this, and it takes time to understand the commands themselves.

Thank you in advance

bluewind 02-12-2010 12:36 PM

Quote:

Originally Posted by rweaver (Post 3861669)
Awk would be more suited to this particular task than sed, in my opinion. However, are you limited to those applications, or can you also use something like a bash script or Perl?


I can use a bash script. I have never tried Perl before.

Thanks

jschiwal 02-12-2010 01:05 PM

Another approach could be to use sed on file1 to create the sed script to use on the file(s).

You haven't approached how to locate the file or files needing to be edited.

bluewind 02-12-2010 11:54 PM

Quote:

Originally Posted by jschiwal (Post 3861747)
Another approach could be to use sed on file1 to create the sed script to use on the file(s).

You haven't approached how to locate the file or files needing to be edited.


I am not sure how to use sed to create a sed script. I'll find out soon : )

I know the location and the names of the files that need to be edited; is this sufficient? I don't have hundreds of files to edit, but I have hundreds of entries to edit in a single file.

bluewind 02-13-2010 12:02 AM

Hi all,

Thanks for your guidance so far. pixellany, I tried the hint you gave and it is really helpful.
This is the code I edited and the output I am getting.

The code is:

#!/bin/sh

while read line; do
sed "/$line/s/$line/!$line/g" file2
done < file1


Test-1 Example:

input file: file1

g+h=o+p
a+b=c+d


file2 (the file to search)

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf

Output (what I am getting from this command)

a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf

!a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf

With this command, the whole output is repeated as many times as there are lines in the input file. Why is that? Right now I am still not sure how to solve it.

And this is what the output should look like.

!a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf


Thanks in advance.

bluewind 02-13-2010 01:25 AM

Hi All,

I managed to get the output I wanted. Thanks to everyone, especially pixellany. Your code was really helpful.

This is the code I have modified to get the desired output:

#!/bin/sh

while read line; do
sed "/$line/s/$line/!$line/g" file2 >tempfile.tmp
mv tempfile.tmp file2
done < file1

Hence I get this output

!a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf

Thanks again :)

jschiwal 02-13-2010 11:32 AM

Your program will run sed for each line of input in file1.

Each time you run it you are running it on the same input file instead of saving what was done so far.

My idea is to use sed to process file1 to create a sed program.

sed '...' file1 >process.sed

This should produce these lines given your specific example:
s/g+h=o+p/!g+h=o+p/
s/a+b=c+d/!a+b=c+d/

Then you would run:
sed -f process.sed file2 >file3

There will be as many lines in process.sed as in your file1 file. If some lines in file1 are repeated, you could use "sort file1 | uniq" to eliminate redundancy.

I have left out the sed command I used to process your input file "file1".

I have used this technique at work to generate a script to delete files from a list of names (without the extension). There are sometimes up to 2000 files I need to delete from about 60 devices. Doing this using a graphical interface would take forever.

One script I make is run in Cygwin/X to delete files in a shared directory on a Windows server. Another is a script I upload to a number of devices. These devices have a custom OS so the command lines are different. I just need to produce the list and then convert it to each style of script.
So your example isn't a far fetched one. You may find it useful in the future.
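
To make the second step concrete, here is a small sketch that writes the two substitution lines shown above into process.sed by hand and applies them in one pass. How process.sed is generated from file1 is deliberately not shown here.

```shell
# Run in an empty scratch directory: this overwrites file2, file3 and process.sed.
printf '%s\n' 'a+b=c+d 1e105' 'x+y=z+s 5e105' \
              'g+h=o+p abcdefg' 't+r=w+q xvyderf' > file2

# The two commands from the post, one per line of file1.
# The quoted 'EOF' keeps the shell from touching the script text.
cat > process.sed <<'EOF'
s/g+h=o+p/!g+h=o+p/
s/a+b=c+d/!a+b=c+d/
EOF

# A single sed run applies every command to every line of file2.
sed -f process.sed file2 > file3
```

Because file2 is read only once, there is no need for a temporary file or a loop.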

pixellany 02-13-2010 11:49 AM

blue*;
I really like your attitude---you are picking up on things without us having to do the whole thing for you. You'd be amazed how many newcomers here can't seem to grab the ball and run with it as you are. Keep it up, and you'll be an LQ guru in no time....

jschiwal 02-13-2010 12:07 PM

pixellany, I wholeheartedly agree. I didn't dare supply more info because I feared he might get mad at not being able to do it himself!

Somehow, I was responding to his second to last post.

bluewind 02-13-2010 02:36 PM

Hi All,

Firstly thanks pixellany and jschiwal for your encouragement.

Secondly, I am currently trying the method suggested by jschiwal, and I will post an update here as soon as I have tried it. At first glance I think it might solve the problem I am having right now [not sure yet :) ].

Thirdly, I just realised that my input file (i.e. file1) has metacharacters in it, and their location varies. So I have a mixed-up input file1 that looks like this:

file1

a+b=c+d 1e105
g+h=o+p abcdefg
rev/ 0.35 / h 35/
h2 / 20 / he / ar

file2

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf
rev/ 0.35 / h 35/
h2 / 20 / he / ar

and my output should look like this as mentioned before

!a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf
!rev/ 0.35 / h 35/
!h2 / 20 / he / ar

So the problem is that when "/" comes into the picture, I get these errors from the previously posted command:

sed: -e expression #1, char 7: unknown command: `3'
sed: -e expression #1, char 7: unknown command: `3'
sed: -e expression #1, char 59: unknown command: `F'
sed: -e expression #1, char 11: extra characters after command

So I tried other methods, such as adding "", '' and /\, but none of them worked. I referred to websites and the sed manual, but most of the examples look for a single fixed string in multiple files, e.g. s/samba/Mamba/g. There the author knows what is being searched for and where any metacharacters will be, which is not the case for me.


Example of failed method:
#!/bin/sh

while read line; do
sed "/$line/"s/"'$line'"/"'!$line'"/g file2
done < file1

Example2
#!/bin/sh

while read line; do
sed "#$line#s#$line\/#!$line\/#g" file2
done < file1

Example 3 [even tried this method but not enough experience I guess :)]

#!/bin/sh

while read line; do
sed {
if ( $line !=rev, $line !=Low, $line !=troe ) then
"/$line/s/$line/!$line/g" file2
else "#$line#s#$line\#!$line\#g" file2
fi
done < file1

What I notice is that I need the double quotes for sed to be able to locate the line, and if I introduce single quotes the variable expansion is disabled, so sed cannot locate the line to replace. How can I have both: locate the line and read it as a whole (h2 / 20 / he / ar), ignoring the "/"?

Thanks in advance

jschiwal 02-13-2010 06:35 PM

If you use
sed "s#pattern#replacepattern#", then the pattern and replacement can have slashes in them. In my solution, I did this as well.

The exclamation point can be a problem if you are working interactively. It is used for recalling commands from the history buffer. It needs to be escaped when typing commands on the command line interface, but not if it's in a script. This can sometimes make debugging or designing a program more difficult.
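
A short sketch of the usual workaround at an interactive bash prompt: put the sed script in single quotes, where bash does not perform history expansion. The file names sample.txt and marked.txt are made up for this example.

```shell
printf '%s\n' 'a+b=c+d 1e105' > sample.txt

# Single quotes keep bash from interpreting "!" as a history reference,
# so this line behaves the same when typed at a prompt or run from a script.
sed 's/^a+b=c+d/!a+b=c+d/' sample.txt > marked.txt
```

The catch is that shell variables such as $line are not expanded inside single quotes either, so this form only suits fixed patterns.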

Looking closer at one of your solutions,
Code:

sed "#$line#s#$line\/#!$line\/#g" file2
The first two octothorpes won't work as written. A pattern address before the sed command normally uses slashes; a different delimiter there has to be introduced with a backslash, as in \#pattern#.
Since $line matches the entire line, you don't need to match a pattern to locate the line. The LHS contains the entire line.
Also, since the entire line is in the LHS, you don't need "g" at the end.

You can put a pattern test at the beginning to determine whether the following sed command is executed.
/pattern/s ...
/pattern/!s ...,

The second one is run if the pattern isn't found.
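
A minimal sketch of the two forms, using a made-up file sample.txt that holds two lines from file2:

```shell
printf '%s\n' 'a+b=c+d 1e105' 'x+y=z+s 5e105' > sample.txt

# /pattern/s...  : the substitution runs only on lines that match the address.
sed '/^a+b/s/^/!/' sample.txt > matched.txt

# /pattern/!s... : the substitution runs only on lines that do NOT match.
sed '/^a+b/!s/^/!/' sample.txt > unmatched.txt
```

Here matched.txt gets the "!" on the a+b line, while unmatched.txt gets it on the x+y line instead.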

---

By the way, your new information about the data causes the particular sed command I used not to work. Working with sed, awk & grep, the pattern of the input can be very important. Special conditions need to be tested for as well.

bluewind 02-14-2010 05:32 AM

Hi All,

Thanks jschiwal for your great input; it was really helpful.
I have finally managed to solve the problem with this script :)
I will soon try to adapt your method of creating the sed commands using sed itself :)

I am posting this because it may be useful for others :) As I mentioned in my previous post, my input file contains metacharacters in various places. Hence the solution I used:

#!/bin/sh

while read line; do
search="$line"
sed "s#$search#!$search#" file2 >tempfile.tmp
mv tempfile.tmp file2
done < file1

Input file1

a+b=c+d 1e105
g+h=o+p abcdefg
rev/ 0.35 / h 35/
h2 / 20 / he / ar

file2 (the file to be changed)

a+b=c+d 1e105
x+y=z+s 5e105
g+h=o+p abcdefg
t+r=w+q xvyderf
rev/ 0.35 / h 35/
h2 / 20 / he / ar

file2 (after the changes)

!a+b=c+d 1e105
x+y=z+s 5e105
!g+h=o+p abcdefg
t+r=w+q xvyderf
!rev/ 0.35 / h 35/
!h2 / 20 / he / ar

So with this script the problem with regards to meta-characters is solved :)
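
One caveat with the script above: each line of file1 is still used as a regular expression, so a line containing #, *, [, \ or & could match something else or break the sed command; the sample data simply does not trigger that. As a hedged alternative, here is a sketch in awk (suggested earlier in the thread as a good fit) that compares whole lines as plain strings. It assumes, as in this post, that file1 holds complete lines of file2, and it writes to file3 instead of changing file2.

```shell
# Run in an empty scratch directory: this overwrites file1, file2 and file3.
printf '%s\n' 'a+b=c+d 1e105' 'g+h=o+p abcdefg' \
              'rev/ 0.35 / h 35/' 'h2 / 20 / he / ar' > file1
printf '%s\n' 'a+b=c+d 1e105' 'x+y=z+s 5e105' 'g+h=o+p abcdefg' \
              't+r=w+q xvyderf' 'rev/ 0.35 / h 35/' 'h2 / 20 / he / ar' > file2

awk '
    NR == FNR    { mark[$0] = 1; next }   # first file: remember every line
    ($0 in mark) { $0 = "!" $0 }          # second file: prefix exact matches
                 { print }                # print every line of the second file
' file1 file2 > file3
```

Since no regular expression is involved, slashes and other special characters in the data need no escaping, and file2 is read only once.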

bluewind 02-26-2010 10:06 AM

cant get the last part
 
Quote:

Originally Posted by jschiwal (Post 3862597)
Your program will run sed for each line of input in file1.

Each time you run it you are running it on the same input file instead of saving what was done so far.

My idea is to use sed to process file1 to create a sed program.

sed '...' file1 >process.sed

This should produce these lines given your specific example:
s/g+h=o+p/!g+h=o+p/
s/a+b=c+d/!a+b=c+d/

Then you would run:
sed -f process.sed file2 >file3

There will be as many lines in process.sed as in your file1 file. If some lines in file1 are repeated, you could use "sort file1 | uniq" to eliminate redundancy.

I have left out the sed command I used to process your input file "file1".

I have used this technique at work to generate a script to delete files from a list of names (without the extension). There are sometimes up to 2000 files I need to delete from about 60 devices. Doing this using a graphical interface would take forever.

One script I make is run in Cygwin/X to delete files in a shared directory on a Windows server. Another is a script I upload to a number of devices. These devices have a custom OS so the command lines are different. I just need to produce the list and then convert it to each style of script.
So your example isn't a far fetched one. You may find it useful in the future.

Hi jschiwal,

I have been trying your suggestion for some time, but I cannot get through the "wall" :)

At the moment I can get as far as:
s/g+h=o+p/!
s/a+b=c+d/!

Using these commands:
#!/bin/sh
sed 's/^/s#/' <file1 >file2
sed 's/$/#!/' <file2 >file3

But I really don't know how to get the first part (g+h=o+p/) repeated at the end of the line, to finally get what you mentioned:

s/g+h=o+p/!g+h=o+p/
s/a+b=c+d/!a+b=c+d/

Thanks in advance
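
One possible way past this point, offered as a sketch rather than as the command jschiwal left out: in the replacement part of an s command, & stands for the whole matched text, so a single substitution can wrap each line of file1 on both sides. The example uses # as the delimiter in the generated commands, as in the attempt above, and writes to process.sed instead of overwriting file2. It assumes that no line of file1 contains #, &, \ or other characters that are special to sed; such lines would need escaping first.

```shell
# Run in an empty scratch directory: this overwrites the files named below.
printf '%s\n' 'g+h=o+p' 'a+b=c+d' > file1
printf '%s\n' 'a+b=c+d 1e105' 'x+y=z+s 5e105' \
              'g+h=o+p abcdefg' 't+r=w+q xvyderf' > file2

# .* matches the whole line; each & in the replacement repeats it.
# "g+h=o+p" becomes "s#g+h=o+p#!g+h=o+p#".
sed 's/.*/s#&#!&#/' file1 > process.sed

# Apply all generated commands to file2 in a single pass.
sed -f process.sed file2 > file3
```

Keeping the generated script in its own file also makes it easy to inspect process.sed before running it against the real data.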


All times are GMT -5.