LinuxQuestions.org

AutoC 07-17-2009 11:26 PM

SED multiline pattern matching
 
Hi,
I have an XML file that contains multiple entries like this:

Quote:

<recording audio="011/011o031e.sph" name="1e">
<segment lang="en_US">
<orth>
Midlantic ,comma a New Jersey bank stock ,comma was up three
quarters to forty seven and three eighths .period
</orth>
</segment>
</recording>
I need to parse each entry and create a file named after its audio="" attribute. For example, for the entry above I need to create 011/011o031e.t and write the text between <orth> and </orth> into that file.
Any help?

ghostdog74 07-18-2009 12:04 AM

awk
Code:

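awk '/audio=/{
 if (filename != "") close(filename)   # done with the previous recording
 for(i=1;i<=NF;i++){                   # scan every field, starting at 1
    if($i ~ /^audio=/){
        name = $i                      # work on a copy so $0 stays intact
        sub(/^audio="/,"",name)        # strip the leading audio="
        sub(/\.[^.\/]*".*$/,"",name)   # strip the extension and closing quote
        filename = name ".t"           # e.g. 011/011o031e.t (directory must exist)
    }
 }
}
/<\/orth>/{f=0}
/<orth>/{f=1;next}
f {  print $0 > filename }' file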
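
To try the idea on the entry from the question, here is a minimal sketch. It is a variant rather than the exact script above: it pulls the attribute value out with match() and substr() and keeps the directory part of the path, so that directory has to exist before awk writes to it. The scratch directory, the file name sample.xml and the <recordings> wrapper element are assumptions made up for the demo.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Sample input: the entry from the question inside a hypothetical root element.
cat > sample.xml <<'EOF'
<recordings>
<recording audio="011/011o031e.sph" name="1e">
<segment lang="en_US">
<orth>
Midlantic ,comma a New Jersey bank stock ,comma was up three
quarters to forty seven and three eighths .period
</orth>
</segment>
</recording>
</recordings>
EOF

# awk does not create directories, so make the one named in audio="...".
mkdir -p 011

awk '
/<recording /{
    # Locate audio="..." anywhere in the tag.
    if (match($0, /audio="[^"]*"/)) {
        path = substr($0, RSTART + 7, RLENGTH - 8)  # text between the quotes
        sub(/\.[^.\/]*$/, "", path)                 # drop the extension
        if (out != "") close(out)                   # release the previous file
        out = path ".t"
    }
}
/<\/orth>/ { f = 0 }         # stop copying at the closing tag
/<orth>/   { f = 1; next }   # start copying after the opening tag
f          { print > out }
' sample.xml

cat 011/011o031e.t
```

The final cat should show just the two lines of text that sat between <orth> and </orth>. Like the script above, this assumes <orth> and </orth> each sit on a line of their own.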


