[SOLVED] Replace 2nd occurrence of a string in a file

kushalkoolwal · 04-05-2010, 01:56 PM

So I know how to replace a particular instance (say the 3rd one) of a word in a line using sed, based on the sed one-liners. However, I would like to replace a particular instance of a word in the entire file. I tried searching on the Internet but did not find anything useful.

For example, here is a file:

Code:

John
Betty
Jack
Ron
Jack
Paul

So now I would like to replace the second instance of Jack (the one on the fifth line) with "Rob", for example. I'm not quite sure how to do that. I tried a couple of things from here, but they did not work.

Thanks

David the H. · 04-05-2010, 02:42 PM

Try this.

Code:

sed  '0,/Jack/! s/Jack/Rob/' file.txt

The exclamation mark negates the address range that runs from the beginning of the file to the first line containing "Jack", so the substitution operates only on the lines that follow it. Note that I believe the 0,/regex/ address form is a GNU sed extension only.

If you need to operate on only the second occurrence, and ignore any subsequent matches, you can use a nested expression.

Code:

sed  '0,/Jack/! {0,/Jack/ s/Jack/Rob/}' file.txt

Here, the commands inside the braces only see the lines that follow the first "Jack". The inner range then closes at the next line that matches "Jack", so the substitution happens there and any later matches are left alone.
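As a quick check of the two commands above, here is a minimal sketch run against the sample names from the first post. The extra trailing "Jack" is an assumption added only to show the difference between the two forms, and both commands assume GNU sed.

```shell
# Sample data from the first post, plus a third Jack (added for illustration only).
names=$(printf '%s\n' John Betty Jack Ron Jack Paul Jack)

# Every matching line after the first Jack is changed.
after_first=$(printf '%s\n' "$names" | sed '0,/Jack/! s/Jack/Rob/')

# Only the second Jack is changed; the third one is left alone.
second_only=$(printf '%s\n' "$names" | sed '0,/Jack/! {0,/Jack/ s/Jack/Rob/}')

printf 'after first:\n%s\n\nsecond only:\n%s\n' "$after_first" "$second_only"
```

Both forms count matching lines rather than matches, so two occurrences on one line are treated as one.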

PS, I've found the sed FAQ to be very helpful in cases like this.

ghostdog74 · 04-05-2010, 06:52 PM

Use awk, where you can maintain a count of things. It's also easier to change if your future requirement is not the second instance.

Code:

awk '/Jack/{c++;if(c==2){sub("Jack","Rob");c=0}}1' file

Star_Gazer · 04-06-2010, 08:00 PM

Interesting...

In the sed info documentation, on the page
info:/sed/The "s" Command
(which can be loaded in Konqueror):

Code:

The `s' command can be followed by zero or more of the following FLAGS: 
 `g'
      Apply the replacement to _all_ matches to the REGEXP, not just the
      first.


 `NUMBER'
      Only replace the NUMBERth match of the REGEXP.


      Note: the POSIX standard does not specify what should happen when
      you mix the `g' and NUMBER modifiers, and currently there is no
      widely agreed upon meaning across `sed' implementations.  For GNU
      `sed', the interaction is defined to be: ignore matches before the
      NUMBERth, and then match and replace all matches from the NUMBERth
      on.

Though I haven't quite got a grasp on it.

David the H. · 04-07-2010, 06:36 AM

When you add a number after the substitution expression, the replacement will happen to the nth occurrence of the pattern on that line; just as 'g' will affect all occurrences on the same line. Combining the two makes it affect all matches from that position on.

Code:

testline="foo bar foo bar foo bar foo bar"

$ echo "$testline" | sed 's/foo/FOO/'
FOO bar foo bar foo bar foo bar

$ echo "$testline" | sed 's/foo/FOO/3'
foo bar foo bar FOO bar foo bar

$ echo "$testline" | sed 's/foo/FOO/g'
FOO bar FOO bar FOO bar FOO bar

$ echo "$testline" | sed 's/foo/FOO/3g'
foo bar foo bar FOO bar FOO bar

Sed works by copying a single line into its pattern space, processing that line, then clearing it and moving on to the next line. This means that the s/// expression on its own cannot affect multiple lines. That's what the addressing expressions and the hold space are for.

Good catch on a little-known feature though.

syg00 · 04-07-2010, 06:57 AM

ghostdog74, unless I'm misunderstanding (quite likely), won't that change every second occurrence, not just the second one?

Star_Gazer · 04-07-2010, 07:43 AM

Ah, thanks for pointing that out.

It was driving me batty when I kept trying to test it, so I gave up.

I'll note this for future reference.

grail · 04-07-2010, 09:45 AM

Quote:

ghostdog74, unless I'm misunderstanding (quite likely), won't that change every second occurrence, not just the second one?

Yes, but if you take out the c=0 at the end, problem solved.

ghostdog74 · 04-07-2010, 08:17 PM

Quote:

Originally Posted by syg00

ghostdog74, unless I'm misunderstanding (quite likely), won't that change every second occurrence, not just the second one?

Yes, I interpreted the OP's requirement that way. If it's only the 2nd instance, then as grail mentioned, remove the c=0.
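To make the difference concrete, here is a small sketch comparing the original awk command with the amended one (c=0 removed). The input list, with four Jacks, is an assumption chosen so that the two behaviours diverge.

```shell
# Sample names with four Jacks (extended from the first post for illustration).
names=$(printf '%s\n' John Betty Jack Ron Jack Paul Jack Jack)

# Original command: the counter is reset, so every second matching line is changed.
every_second=$(printf '%s\n' "$names" | awk '/Jack/{c++;if(c==2){sub("Jack","Rob");c=0}}1')

# Amended command: no reset, so only the second matching line is changed.
only_second=$(printf '%s\n' "$names" | awk '/Jack/{c++;if(c==2){sub("Jack","Rob")}}1')

printf 'every second:\n%s\n\nonly second:\n%s\n' "$every_second" "$only_second"
```

As with the sed address approach, the counter counts matching lines, not individual matches within a line.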

typer100 · 05-02-2011, 01:38 PM

Old thread, but I have another question. I'll use the current example...

testline="foo bar foo bar foo bar foo bar"

Let's say I want to replace occurrences 2 and 4 with FOO. I can obviously do this with 2 commands...

echo "$testline" | sed 's/foo/FOO/2'|sed 's/foo/FOO/3'
foo bar FOO bar foo bar FOO bar

It works, but is there any chance of a one-liner, something like s/foo/FOO/2,4?

colucix · 05-02-2011, 02:30 PM

Code:

echo "$testline" | sed -r 's/(.*foo.*)foo(.*foo.*)foo/\1FOO\2FOO/'
foo bar FOO bar foo bar FOO bar
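The regex above relies on the line containing exactly four matches. As an alternative sketch (not from the posts above), two numbered substitutions can be chained in a single sed call, applying the higher occurrence first so that the earlier replacement does not shift the count. The five-foo input is an assumption added to show that extra matches are left alone.

```shell
testline="foo bar foo bar foo bar foo bar"

# Replace the 4th match first, then the 2nd, within one sed invocation.
result=$(echo "$testline" | sed 's/foo/FOO/4; s/foo/FOO/2')

# Same command on a line with five matches: still only the 2nd and 4th change.
longer=$(echo "foo foo foo foo foo" | sed 's/foo/FOO/4; s/foo/FOO/2')

printf '%s\n%s\n' "$result" "$longer"
```

The ordering only matters when the replacement no longer matches the pattern, as is the case here because the match is case-sensitive.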

kropex · 06-10-2020, 06:29 AM

Why is this thread marked SOLVED? I really don't see the solution, even though the original question is very well explained. Replacing just the Nth occurrence in a file (not in a line) using sed is something I could not find anywhere on the internet. This thread covers a very good subject and is marked solved, but I can't find the solution in it.

shruggy · 06-10-2020, 06:36 AM

Sorry? The awk solution is in #3, by ghostdog74. Amended by grail in #8. Admittedly, the solution is not perfect as it doesn't provide for the case when the second occurrence happens on the same line as the first.

Recent GNU sed versions (4.2.2+) support NUL-terminated lines (the -z option), which makes sed treat a text file containing no NUL bytes as one single line. So for text files of reasonable size this should work better:

Code:

sed -z 's/Jack/Rob/2' file
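A short sketch of what -z changes, assuming GNU sed 4.2.2 or later. The second input, with two matches on one line, is a made-up example of the case that the line-counting approaches miss.

```shell
# Sample names from the first post, plus a third Jack (added for illustration).
names=$(printf '%s\n' John Betty Jack Ron Jack Paul Jack)

# With -z the whole input is one record, so "2" counts matches across lines.
across_lines=$(printf '%s\n' "$names" | sed -z 's/Jack/Rob/2')

# Two matches on the first line: the second match is still the one replaced.
same_line=$(printf '%s\n' 'Jack met Jack' 'Jack left' | sed -z 's/Jack/Rob/2')

printf '%s\n\n%s\n' "$across_lines" "$same_line"
```

Because the whole file is held in memory as one record, this is best kept to files of moderate size.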

If we're speaking about replacing words (however they have been defined) as opposed to arbitrary strings, then awk would cut it better again:

Code:

BEGIN{
  FS = "[^[:alnum:]]+"
}
{
  for (i=1; i<=NF; i++){
    word[$i]++
    if (word[$i]==2 && $i=="Jack") $i="Rob"
  }
  print
}

kropex · 06-10-2020, 03:11 PM

OK, let me give the solution that I, and I think everyone else, was expecting; I worked a lot to solve it this way.
The idea is to obtain the numbers of the lines in the file that contain the searched word, and then to do the replacement directly on the chosen line.

Code:

    # Generate a unique MAC address for my file (fixed 42:21:10 prefix plus 3 random bytes)
    uniqueMAC=42:21:10$(hexdump -n3 -e '/1 ":%02x"' /dev/random)

    # Build the replacement text; the \ escapes the / so that sed does not take it as a delimiter
    MACxml="address='$uniqueMAC'\/>"

    # Find the lines that contain the string and get their line numbers (there may be more than one)
    mod_lines=$(grep -n 'address=' "$myfile.xml" | cut -d':' -f1)

    # Split the line numbers into an array so that each one can be addressed individually
    no_lines=($mod_lines)

    # Run the substitution only on the chosen line number
    sed -i "${no_lines[1]} s/address=.*/$MACxml/" "$myfile.xml"
    # Index [1] above is the second array element, i.e. the second matching line;
    # 'address=' and all text after it on that line are replaced

The code above was tested and works great. I hope it will help a lot of people, and that we can consider this thread really SOLVED.
Thanks to everyone, especially the thread initiator.
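The same line-number technique can be tried out on throwaway data. This is only a sketch: the three-line sample file and the replacement value 'cc' are made up for illustration, and the result is written to stdout rather than edited in place.

```shell
# Hypothetical sample file standing in for the real XML.
tmpfile=$(mktemp)
printf '%s\n' "<mac address='aa'/>" "<ip x='1'/>" "<mac address='bb'/>" > "$tmpfile"

# Line number of the second line that contains 'address='.
line=$(grep -n 'address=' "$tmpfile" | cut -d: -f1 | sed -n '2p')

# Substitute only on that line.
result=$(sed "${line} s/address=.*/address='cc'\/>/" "$tmpfile")

rm -f "$tmpfile"
printf '%s\n' "$result"
```

Note that this targets the second matching line, so a line holding two matches still counts once.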

shruggy · 06-10-2020, 05:53 PM

Glad you sorted it out. But sed is the wrong tool for parsing XML. In your particular case, this really feels like another example of the XY problem.

Suppose your input data look like this (example taken from the libvirt docs):

Code:

<domain type='qemu'>
  <devices>
    <interface type='server'>
      <mac address='52:54:00:22:c9:42'/>
      <source address='192.168.0.1' port='5558'/>
    </interface>
    <interface type='client'>
      <mac address='52:54:00:8b:c9:51'/>
      <source address='192.168.0.1' port='5558'/>
    </interface>
  </devices>
</domain>

Then, to change the MAC address on the second interface, do

Code:

xmlstarlet ed -u "//interface[@type='client']/mac/@address" -v 52:54:00:8b:c9:43 myfile.xml