sed search for a string, duplicate original and replace the string

Jykke · 06-10-2016, 05:10 AM

I have a file containing some placeholders. I want to duplicate each line with a matching string, do a replacement on the new line, and carry on.

For example, original file:
blablablaa
blablablaa
blablablaa
blablablaa
blablablaa<replaceme>blablablaa
blablablaa
blablablaa

Now I can replace <replaceme> with sed "s#<replaceme>#REPLACEMENT#",
but what I want is:
blablablaa
blablablaa
blablablaa
blablablaa
blablablaa<replaceme>blablablaa
blablablaaREPLACEMENTblablablaa
blablablaa
blablablaa

Ultimately I want to do this several times and then remove the original line, so that I have:
blablablaa
blablablaa
blablablaa
blablablaa
blablablaaREPLACEMENT_1blablablaa
blablablaaREPLACEMENT_2blablablaa
blablablaaREPLACEMENT_3blablablaa
blablablaaREPLACEMENT_4blablablaa
blablablaa
blablablaa

How can I accomplish this with sed?

syg00 · 06-10-2016, 05:22 AM

With great difficulty. Why the insistence on sed?

Jykke · 06-10-2016, 09:43 AM

That was the first thing that came to mind. While I can accomplish a few pathetic string operations with sed, I am completely illiterate when it comes to awk and the like.

allend · 06-10-2016, 08:32 PM

You can use back references in sed to go from the original file to the intermediate.

Code:

sed 's#\(.*\)<replaceme>\(.*\)#\1<replaceme>\2\n\1REPLACEMENT\2#' <original>

Or, more concisely, using the & character, which stands for the entire matched text:

Code:

sed 's#\(.*\)<replaceme>\(.*\)#&\n\1REPLACEMENT\2#' <original>
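
To make the effect of these commands concrete, here is a minimal sketch run against a three-line sample. The file name sample.txt and the function name dup_replace are made up for illustration, and the \n in the replacement is assumed to be handled by GNU sed (other seds may treat it differently).

```shell
# Hypothetical sample file, not taken from the thread.
printf '%s\n' 'aaa' 'bbb<replaceme>ccc' 'ddd' > sample.txt

# dup_replace FILE (illustrative name)
# & re-emits the whole matched line; \n (GNU sed) starts the duplicate,
# which is rebuilt from the two captured halves around the placeholder.
dup_replace() {
    sed 's#\(.*\)<replaceme>\(.*\)#&\n\1REPLACEMENT\2#' "$1"
}

dup_replace sample.txt
```

Lines without <replaceme> pass through untouched; the matching line is printed once as-is and once with the placeholder filled in.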

syg00 · 06-10-2016, 11:01 PM

Sed is good at substitution(s), but that incrementing number requirement makes it just an ugly hack. There are people here that will take that as a challenge - not me ....

I'm normally averse to just giving out solutions, but this tossed up some interesting side notes. Here's some awk that should hopefully be reasonably readable

Code:

awk '{if (!/<replaceme>/) {print ; next} ;  for (i=1; i<5; i++) { a=gensub(/<replaceme>/, "REPLACEMENT_"i, "1") ; print a}} ' input.file

Some assumptions I made:
- the matched string is always substituted.
- only the first match on each line is changed.
- exactly 4 new lines are always generated.
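
gensub() is a GNU awk function. Where only a POSIX awk is available, the same three assumptions could be sketched with sub() applied to a copy of the line. The function name expand_lines and the count parameter are illustrative choices here, not part of the post above.

```shell
# Hypothetical sample input, not taken from the thread.
printf '%s\n' 'aaa' 'bbb<replaceme>ccc' 'ddd' > input.file

# expand_lines FILE COUNT (illustrative name and parameter)
expand_lines() {
    awk -v count="$2" '
    !/<replaceme>/ { print; next }     # lines without the placeholder pass through
    {
        for (i = 1; i <= count; i++) {
            copy = $0                  # work on a copy so $0 stays intact
            sub(/<replaceme>/, "REPLACEMENT_" i, copy)   # first match only
            print copy
        }
    }' "$1"
}

expand_lines input.file 4
```

As in the one-liner, the original placeholder line itself is not printed, only its numbered copies.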

Jykke · 06-10-2016, 11:36 PM

Sorry, I think I confused things by putting an increment into the replacement text. Actually, I have a file with these placeholders.
I have another file with replacement strings. I read this second file line by line and, for each replacement string, want to duplicate the original lines containing placeholders and
do the replacement in the duplicated line. That way, when I read the next replacement line, I can repeat the procedure.

So maybe a more accurate description would be:

Template with placeholders:
blablablaablablablaablablablaa
blablablaablablablaa
blablablaa<placeholder>blablablaablablablaa
blablablaablablablaablablablaa<placeholder>
blablablaa

replacement strings in the second file:
first_replacement
another_one

After the first loop the file should look like:
blablablaablablablaablablablaa
blablablaablablablaa
blablablaa<placeholder>blablablaablablablaa
blablablaafirst_replacementblablablaablablablaa
blablablaablablablaablablablaa<placeholder>
blablablaablablablaablablablaafirst_replacement
blablablaa

After the second loop:
blablablaablablablaablablablaa
blablablaablablablaa
blablablaa<placeholder>blablablaablablablaa
blablablaaanother_oneblablablaablablablaa
blablablaafirst_replacementblablablaablablablaa
blablablaablablablaablablablaa<placeholder>
blablablaablablablaablablablaaanother_one
blablablaablablablaablablablaafirst_replacement
blablablaa

In the end I can delete the lines with placeholders. The number of replacement strings in the second file may vary.
I think the solution might already have been presented. I'll muck around a bit and we'll see.

The framework for the file operations I already have:

Code:

#!/bin/bash
# $1: file with the replacement strings, one per line
rm -rf tmp.txt
touch tmp.txt
while IFS= read -r line; do
	# TODO: sed to duplicate lines with <placeholder> and replace <placeholder> with $line in the duplicate
	:
done < "$1"
# TODO: sed to delete the lines with <placeholder>
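
One way the two sed steps in this framework could be filled in is sketched below. It is only a sketch: the function name fill_template, the argument order and the scratch file tmp.new are assumptions, and the replacement strings are assumed not to contain '#', '&' or a backslash, all of which are special inside a sed replacement.

```shell
#!/bin/bash
# fill_template REPLACEMENTS TEMPLATE (illustrative name and argument order)
fill_template() {
    local repl=$1 tmpl=$2 line
    cp "$tmpl" tmp.txt
    while IFS= read -r line; do
        # p prints the untouched placeholder line first; the s command then
        # turns the pattern space into the filled-in duplicate, which sed
        # prints right after it. g fills every placeholder on the line.
        sed "/<placeholder>/{p;s#<placeholder>#${line}#g;}" tmp.txt > tmp.new
        mv tmp.new tmp.txt
    done < "$repl"
    # Finally drop the lines that still carry the placeholder.
    sed '/<placeholder>/d' tmp.txt
}
```

Because each new duplicate is placed directly below its placeholder line, the most recent replacement ends up first, which matches the "after the second loop" listing above.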

Jykke · 06-11-2016, 12:31 AM

OK, both seds from allend came very close to hacking it. The only problem is that they do not work if there are multiple <placeholder> occurrences on the same line.
Additionally, I intend to do a second replacement using a counter, so one more substitution per iteration:

sourcefile:
blabla<id>blabla<placeholder>bla
blablabla<id>,<id>blabla
blablablablablablablablabla

replacement strings:
first_replacement
another_replacement

source should become:
blabla<id>blabla<placeholder>bla
blabla1blablafirst_replacementbla
blabla2blablaanother_replacementbla
blablabla<id>,<id>blabla
blablabla1,1blabla
blablabla2,2blabla
blablablablablablablablabla

Code:

#!/bin/bash
# $1: file with the replacement strings, $2: source file (edited in place)
i=1
while IFS= read -r line; do
	sed -i "s#\(.*\)<placeholder>\(.*\)#&\n\1${line}\2#" "$2"
	sed -i "s#\(.*\)<id>\(.*\)#&\n\1${i}\2#" "$2"
	i=$((i+1))
done < "$1"
# TODO: sed to delete the lines with <placeholder>

The & is a good idea, but somehow the replacement needs to be modified.

I could simplify it a bit, since in practice there is only a limited number of replacements
(it is really just formatting), and I think I'll get there, but a universal solution would be more educational.
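
For what it is worth, a sed-only variant that copes with several markers on one line could use the hold space instead of back references. This is a sketch under assumptions: the function name expand_source and the scratch files work.txt and work.new are made up, sed is assumed to support -E, and the replacement strings are assumed not to contain '#', '&' or a backslash.

```shell
#!/bin/bash
# expand_source REPLACEMENTS SOURCE (illustrative name and argument order)
expand_source() {
    local repl=$1 src=$2 line i=1
    cp "$src" work.txt
    while IFS= read -r line; do
        # h saves the original line in the hold space, the two s commands
        # fill in every marker, p prints the filled-in copy, and g restores
        # the original so it is still there for the next replacement string.
        sed -E "/<placeholder>|<id>/{h;s#<placeholder>#${line}#g;s#<id>#${i}#g;p;g;}" \
            work.txt > work.new
        mv work.new work.txt
        i=$((i+1))
    done < "$repl"
    # Remove the template lines once all replacements are done.
    sed -E '/<placeholder>|<id>/d' work.txt
}
```

Each copy is written above its template line, so the copies accumulate in file order (1, then 2) before the template lines are removed at the end.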

allend · 06-11-2016, 07:03 AM

The task has morphed from your original post. I suggest that you look at alternative tools such as awk or perl.
With awk, you could read the replacement strings into an array, then add lines as required while iterating over the array.

Code:

awk 'BEGIN {n=0;while ((getline s < "replacement_strings.txt") > 0) {n++;r[n]=s}} {print} /<placeholder>/ {for (i=1;i<=n;i++) {s=gensub(/<placeholder>/,r[i],"1");s=gensub(/<id>/,i,"g",s);print s}} /<id>,/ {for (i=1;i<=n;i++) {s=gensub(/<id>/,i,"g"); print s}}' sourcefile

sundialsvcs · 06-13-2016, 07:14 AM

Kindly remember a few things . . .

- You have a dozen programming languages to choose from, and any of them are just a #!shebang away. sed, like "bash scripting," is suitable for only the most-trivial applications (IMHO), and, while awk is an advance over this, "the world is your oyster."
- "One-liners," generally, are Evil. (IMHO.) It's okay for your solution to consist of more than one file.
- Above all, design and write your solution to be durable, flexible, and maintainable. Please don't write "write-only code."

I've more-or-less made my living out of cleaning-up after people.

onebuck · 06-13-2016, 07:49 AM

Moved: This thread is more suitable in <Programming> and has been moved accordingly to help your thread/question get the exposure it deserves.

allend · 06-13-2016, 08:35 AM

If the replacement strings are in a file named 'replacement_strings.txt',
and the source file is named 'sourcefile',
then a file named 'replace.awk' containing the code below (multiline version of post #8)
can be used with the command 'awk -f replace.awk sourcefile',
to achieve the requested output.

Code:

# Read contents of replacement_strings.txt file into array r[1..n]
BEGIN {
  n=0
  while ((getline s < "replacement_strings.txt") > 0)
    {n++;r[n]=s}}

# Echo all lines from input file
{print}

# If line contains '<placeholder>', add lines with substitutions
# for <placeholder> for each element of array r[] and
# for <id> with a numeric identifier of the element of the array.
# A counted loop is used because the order of 'for (i in r)' is unspecified.
/<placeholder>/ {
  for (i=1; i<=n; i++)
    {s=gensub(/<placeholder>/,r[i],"1")
    s=gensub(/<id>/,i,"g",s)
    print s}}

# If line contains '<id>,', add lines with substitutions
# for <id> with a numeric identifier for each element of array r[]
/<id>,/ {
  for (i=1; i<=n; i++)
    {s=gensub(/<id>/,i,"g")
    print s}}
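
Since gensub() exists only in GNU awk, a variant for a plain POSIX awk might look like the sketch below. The wrapper name expand_awk and the awk variable repl are assumptions made for illustration, and the two pattern rules are merged into one so that a line carrying both markers is expanded only once. Note that an '&' inside a replacement string would still be treated specially by sub().

```shell
# expand_awk REPLACEMENTS SOURCE (illustrative name and argument order)
expand_awk() {
    awk -v repl="$1" '
    BEGIN {
        # Load the replacement strings into r[1..n]
        while ((getline s < repl) > 0) r[++n] = s
        close(repl)
    }
    { print }                              # echo every input line
    /<placeholder>|<id>/ {
        for (i = 1; i <= n; i++) {         # counted loop keeps file order
            copy = $0
            sub(/<placeholder>/, r[i], copy)   # first placeholder only
            gsub(/<id>/, i, copy)              # every <id> on the line
            print copy
        }
    }' "$2"
}
```

It would be called as 'expand_awk replacement_strings.txt sourcefile', using the file names from the post above.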

HMW · 06-13-2016, 09:36 AM

Quote:

Originally Posted by sundialsvcs

"One-liners," generally, are Evil. (IMHO.) It's okay for your solution to consist of more than one file. Above all, design and write your solution to be durable, flexible, and maintainable.

^This should be axiomatic. Well written!

Best regards,
HMW

sundialsvcs · 06-13-2016, 02:56 PM

Quote:

Originally Posted by allend

If the replacement strings are in a file named 'replacement_strings.txt',
[...]
then ...

"Hear! Hear!"

IMHO, a very critical point has been made here: "the solution has now been generalized!"

"When(!), many times in the future, someone needs to replace something else," they might only need to add a single line to a text file.

Furthermore, in order to convince themselves that their actions will produce the expected result, they need only examine the "well-formatted, easily readable, awk file with comments(!)" in order to reliably understand what's going on.

... and "if, somehow, that was what I could actually count on, in my job" ...

...

(a) I would feel like I had just died and gone to Heaven, and ...

(b) I probably wouldn't have a job.

JockVSJock · 06-20-2016, 08:46 AM

Quote:

Originally Posted by sundialsvcs

"One-liners," generally, are Evil. (IMHO.) It's okay for your solution to consist of more than one file.

I'm curious why you're against one-liners.

I've seen other posts where people brag that they have optimized others' code into one line, and I've also seen job posts where they ask "what is the one-liner you are most proud of and why?"

chrism01 · 06-29-2016, 05:48 AM

One liners are a fun exercise at home to see what you can do, but as a long time worker in the industry, in a commercial env they are (generally) a pita; hard to debug and maintain.
Unintended side-effects are a common problem, as are unexpected results with impure inputs.

K.I.S.S. https://en.wikipedia.org/wiki/KISS_principle