LinuxQuestions.org
01-19-2010, 05:13 AM   #1
jkmaster

Sed / Replace multiline, multiple instances


Hello,

I've written a sed command to match and replace a block of five lines.
So far, the command replaces only one instance of the five lines.

What do I need to change to replace all instances? The instances are not entirely the same, but do match the regular expressions.

Here is the command file (used with sed -n -f).
For clarity, I have left out the actual regexes.

Code:
/regex to match the first line/ {
H
# append 2nd line
n
H
# append 3rd line
n
H
# append 4th line
n
H
# append 5th line
n
H
# get the contents of the Hold buffer, then replace
g
s/regex to match the five lines/regex to replace the five lines/
# clear the Hold buffer
x;s/.*//;x;
}
p
I'm using sed from GnuWin32 on Windows XP.

Thanks in advance for your help.

jk
 
01-19-2010, 07:19 AM   #2
pixellany

This strikes me as doing things the hard way, but it should work.

Some observations:
1. If the first movement to the hold buffer is "h" and not "H", then you don't need the code to clear the hold buffer.

2. Normally, when you combine multiple lines, you would strip out the newlines before attempting further actions.

3. I don't understand the "p" outside of the {...} construct.

Have you tried this in a script instead of having sed call a file? (I don't know if this could be relevant.)

The best way to get help on this is to post a sample of the actual file and the specific changes you want to make.
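
To illustrate observation 1, here is a minimal sketch. It assumes a POSIX shell and a sed that accepts `;`-separated commands (GNU sed does); the two-line input is made up for the demonstration. `h` overwrites the hold buffer, whereas `H` appends a newline plus the pattern space to whatever is already there, even when the buffer is empty.

```shell
# h copies the pattern space over the hold buffer;
# H appends a newline plus the pattern space to it.
with_h=$(printf 'a\nb\n' | sed -n '1{h;n;H;g;p;}')
with_H=$(printf 'a\nb\n' | sed -n '1{H;n;H;g;p;}')

printf 'first command h:\n%s\n' "$with_h"
printf 'first command H:\n%s\n' "$with_H"
```

Starting with `h` yields just the two lines; starting with `H` yields the same two lines preceded by an empty line, and in a longer file anything left in the buffer from an earlier match would still be in front unless the buffer is cleared.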
 
01-19-2010, 08:39 AM   #3
jkmaster (Original Poster)

Thanks a lot for your reply.

I have used sed before for basic find-and-replace tasks, but this one is more ambitious, so I'm probably missing some skills or best practices in sed scripting.

Anyway, here's what I'm trying to do:

Starting with a file generated in a markup language (FrameMaker MIF), I'd like to
- take some string values (the XRefName, XRefSrcText and XRefSrcFile values, highlighted in red in the original post), and
- add the same values inside a different markup element in the original file (the <Marker ...> blocks, highlighted in blue in the original post).

The original MIF file has 30000+ lines, so here's an excerpt with the relevant bits on which you can try out the script:

Code:
 <MIFFile 8.00>
 ----- snip -----
  <XRef 
  <XRefName `Navigation 4'>
  <XRefSrcText `CHDEEGGJ'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263890843 26000>
  <Unique 1007728>
  <Element 
   <Unique 1007731>
   <ETag `xref'>
   <Attributes 
   <Attribute 
    <AttrName `IDREF'>
    <AttrValue `CHDEEGGJ'>
   > # end of Attribute
   > # end of Attributes
   <Collapsed No>
   <SpecialCase No>
   <AttributeDisplay ReqAndSpec>
  > # end of Element
  > # end of XRef
  ----- snip -----
  <XRef 
  <XRefName `Heading & Page'>
  <XRefSrcText `CHDHFFJF: cname: 1.1.1.2 Sample Heading'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263801131 424000>
  <Unique 1011275>
  <Element 
   <Unique 1011278>
   <ETag `xref'>
   <Attributes 
   <Attribute 
    <AttrName `IDREF'>
    <AttrValue `CHDHFFJF'>
   > # end of Attribute
   > # end of Attributes
   <Collapsed No>
   <SpecialCase No>
   <AttributeDisplay ReqAndSpec>
  > # end of Element
  > # end of XRef
  ----- snip -----
  # EOF
The intended result should look like this. However, I get only the first of the two <Marker> blocks (the additions highlighted in blue in the original post):

Code:
 <MIFFile 8.00>
 ----- snip -----
  <Marker
  <MType 12>
  <MTypeName `UnstructXRef'>
  <MText `;;Navigation 4;;CHDEEGGJ;;test2.xml;;' >
  > # end of Marker
  <XRef 
  <XRefName `Navigation 4'>
  <XRefSrcText `CHDEEGGJ'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263890843 26000>
  <Unique 1007728>
  <Element 
   <Unique 1007731>
   <ETag `xref'>
   <Attributes 
   <Attribute 
    <AttrName `IDREF'>
    <AttrValue `CHDEEGGJ'>
   > # end of Attribute
   > # end of Attributes
   <Collapsed No>
   <SpecialCase No>
   <AttributeDisplay ReqAndSpec>
  > # end of Element
  > # end of XRef
  ----- snip -----
  <Marker
  <MType 12>
  <MTypeName `UnstructXRef'>
  <MText `;;Heading & Page;;CHDHFFJF: cname: 1.1.1.2 Sample Heading;;test2.xml;;' >
  > # end of Marker
  <XRef 
  <XRefName `Heading & Page'>
  <XRefSrcText `CHDHFFJF: cname: 1.1.1.2 Sample Heading'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263801131 424000>
  <Unique 1011275>
  <Element 
   <Unique 1011278>
   <ETag `xref'>
   <Attributes 
   <Attribute 
    <AttrName `IDREF'>
    <AttrValue `CHDHFFJF'>
   > # end of Attribute
   > # end of Attributes
   <Collapsed No>
   <SpecialCase No>
   <AttributeDisplay ReqAndSpec>
  > # end of Element
  > # end of XRef
  ----- snip -----
Finally, here is the original sed script from my input file:

Code:
/<XRef\s$/ {
h
n
# 2
H
n
# 3
H
n
# 4
H
n
# 5
H
g
s/\(.*\s<XRef.*\s<XRefName\s`\([A-z0-9 ]*\)'>.*\s<XRefSrcText\s`\([A-z0-9 ]*\)'>.*\s<XRefSrcFile\s`\([A-z0-9<>\\ ]*\)'>\)/<Marker\n<MType 12>\n<MTypeName `UnstructXRef'>\n<MText ;;\2;;\3;;\4;; >\n> \# end of Marker\n\1/
}
p
Leading whitespace is not critical in the output.

I find it convenient to use an input file, but I would be just as happy with a working command-line script. I've already changed the first H to lowercase and removed the line for clearing the hold buffer.

Thanks again for listening. Maybe you could give me some more hints on how to proceed?


jk

Last edited by jkmaster; 01-19-2010 at 08:40 AM.
 
01-19-2010, 10:43 AM   #4
pixellany

Why not do the changes line by line?

e.g. (one -e expression for each line you want to change):

sed -e 's/old1/new1/' \
    -e 's/old2/new2/' \
    -e 's/old3/new3/'

... etc.

Last edited by pixellany; 01-19-2010 at 01:48 PM. Reason: fixed typos (missing "s" in 2 places)
 
01-19-2010, 09:57 PM   #5
ghostdog74

Code:
awk '
$1=="<XRef" {
    o=$0
    for(i=1;i<=4;i++){
        getline
        if(i==3) continue
        gsub(/.* \047|\047>|.*<c\\>/,"")
        s=s $0";;"
    }
    string=s"\047 >"
    printf "<Marker\n<MType 12>\n<MTypeName \047UnstructXRef\047\n"
    print "<MText \047" string
    print " > # end of Marker "
    print o;next
    s=""
}1 ' file

Last edited by ghostdog74; 01-20-2010 at 03:52 AM.
 
01-20-2010, 02:53 AM   #6
jkmaster (Original Poster)

Thanks very much for your replies.

Pixellany,
I could assemble the <Marker> ('blue') blocks using line-by-line operations, but is it possible to keep the original lines together as consecutive lines at the same time? How should I proceed?


ghostdog74,
As I have no experience with awk so far, I just tried out your sample script without really understanding what exactly it does. The output gets a bit mixed up (the original lines do not stay in place, and the substrings extracted from the 1st instance are inserted in the "MText" line for the 2nd instance), but I'll give it a shot.

jk
 
01-20-2010, 03:54 AM   #7
ghostdog74

Quote:
Originally Posted by jkmaster
ghostdog74,
As I have no experience with awk so far, I just tried out your sample script without really understanding what exactly it does. The output gets a bit mixed up (the original lines do not stay in place, and the substrings extracted from the 1st instance are inserted in the "MText" line for the 2nd instance), but I'll give it a shot.

jk

You have backticks in your file, like
Code:
<XRefName `Navigation 4'>
Is that correct, or should it really be a single quote? I changed all backticks to single quotes for my testing, so if you don't get the correct results, the backticks are most probably the cause.
 
01-21-2010, 05:56 AM   #8
jkmaster (Original Poster)

Quote:
Originally Posted by ghostdog74
you have backticks in your file, like
Code:
<XRefName `Navigation 4'>
is that correct?

Yes, these are required by the file format. I tried the awk script with a changed sample and it looks better. Still, the original lines from which the substrings are extracted must be retained.

Right now I'm busy with something else, but I'll post again to report how far I get with sed or awk.

jk
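
For reference, both symptoms reported above can be traced to the posted awk script: `s=""` sits after `next` and is therefore never reached, and the lines read with `getline` are never printed. The following is only a sketch of one possible reworking, not ghostdog74's method. It assumes the five-line layout of the excerpt, a `<c\>` prefix on the file name, and an awk that understands the `\047` octal escape for a single quote; the names `add_markers`, `value`, `block` and `text` are made up for illustration.

```shell
add_markers() {
    awk '
    # Return the text between the backtick and the closing quote.
    function value(line) {
        sub(/^[^`]*`/, "", line)    # drop everything up to the backtick
        sub(/\047>.*$/, "", line)   # drop the closing quote and the rest
        return line
    }
    $1 == "<XRef" {
        block = $0                  # original lines, kept for output
        text = ";;"                 # reset for every <XRef block
        for (i = 1; i <= 4; i++) {
            if ((getline line) <= 0) break
            block = block "\n" line
            if (i == 3) continue    # <XRefSrcIsElem ...> has no value
            v = value(line)
            sub(/^<c\\>/, "", v)    # strip the <c\> prefix, if present
            text = text v ";;"
        }
        print "<Marker"
        print "<MType 12>"
        print "<MTypeName `UnstructXRef\047>"
        print "<MText `" text "\047 >"
        print "> # end of Marker"
        print block
        next
    }
    { print }
    '
}

# Try it on an abbreviated copy of the first block from the excerpt.
add_markers <<'EOF'
  <XRef 
  <XRefName `Navigation 4'>
  <XRefSrcText `CHDEEGGJ'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263890843 26000>
EOF
```

The function reads standard input, so on a real file it would be called as `add_markers < file.mif > new.mif` (file names hypothetical). Matching on `$1` instead of a regex also makes it insensitive to the trailing space after `<XRef`.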
 
01-28-2010, 09:00 AM   #9
jkmaster (Original Poster)

Just wanted to give an update, as I just got it working after staring at the sed command very intensely for a few minutes ...
My multiline script was not wrong at all, except that the search regex didn't work for all instances.
It works like a charm when I change the following:

Code:
[A-z0-9 ]   <= old search regex
[^\n\r']    <= improved search regex
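
A small sketch of why the old class misses the second block, assuming a POSIX shell and grep, with the `C` locale forced so that `A-z` is read in ASCII order. `[A-z0-9 ]` covers letters, digits, the space and the six ASCII characters between `Z` and `a`, but not `&`, `:` or `.`, all of which occur in the second block's values. grep works on one line at a time, so the `\n\r` part of the new class is left out here.

```shell
# The three name/text values from the excerpt, one per line.
values='Navigation 4
Heading & Page
CHDHFFJF: cname: 1.1.1.2 Sample Heading'

# Count the values that each class matches from start to end.
old=$(printf '%s\n' "$values" | LC_ALL=C grep -c "^[A-z0-9 ]*\$")
new=$(printf '%s\n' "$values" | LC_ALL=C grep -c "^[^']*\$")

echo "old class matches $old of 3 values, new class matches $new of 3"
```

Under these assumptions the old class accepts only `Navigation 4`, while the negated class accepts all three values.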
So this is the sed input file in its entirety:
Code:
/<XRef\s$/ {
h
n
# 2
H
n
# 3
H
n
# 4
H
n
# 5
H
g
s/\(.*\s<XRef.*\s<XRefName\s`\([^\r\n']*\)'>.*\s<XRefSrcText\s`\([^\r\n']*\)'>.*\s<XRefSrcFile\s`\([^\r\n']*\)'>\)/<Marker\n<MType 12>\n<MTypeName `UnstructXRef'>\n<MText ;;\2;;\3;;\4;; >\n> \# end of Marker\n\1/
}
p
Thanks again for your suggestions.

jk
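
A more compact variant, offered only as a sketch and not as the script used above. It assumes GNU sed (for `\n` in the replacement), that every file name starts with `<c\>`, and that the four lines of interest directly follow the `<XRef` line. It collects the lines with `N` instead of the hold buffer, puts the backtick and quote around the `MText` value as in the intended result, and drops the `<c\>` prefix. The names `addmarker.sed` and `add_markers_sed` are made up.

```shell
# Write the sed command file (run without -n; sed prints automatically).
cat > addmarker.sed <<'EOF'
/<XRef[[:space:]]*$/ {
    # pull the next four lines into the pattern space
    N
    N
    N
    N
    # the trailing & re-emits the original five lines after the Marker block
    s/.*<XRefName `\([^']*\)'>.*<XRefSrcText `\([^']*\)'>.*<XRefSrcFile `<c\\>\([^']*\)'>.*/<Marker\n<MType 12>\n<MTypeName `UnstructXRef'>\n<MText `;;\1;;\2;;\3;;' >\n> # end of Marker\n&/
}
EOF

add_markers_sed() { sed -f addmarker.sed "$@"; }

# Try it on an abbreviated copy of the second block from the excerpt.
add_markers_sed <<'EOF'
  <XRef 
  <XRefName `Heading & Page'>
  <XRefSrcText `CHDHFFJF: cname: 1.1.1.2 Sample Heading'>
  <XRefSrcIsElem Yes>
  <XRefSrcFile `<c\>test2.xml'>
  <XRefLastUpdate 1263801131 424000>
EOF
```

A block whose lines do not fit the pattern is passed through unchanged, so the result can be checked by counting `<Marker` lines against `<XRef` lines.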
 
  


All times are GMT -5.