LinuxQuestions.org - [SOLVED] using sed to extract data from xml tags and make the result displayed in one line


I have an XML file similar to this.
Suppose the file is named Example.

<PMID>10605436</PMID>
<Year>2000</Year>
<ArticleTitle>Steroids</ArticleTitle>
<MedlinePgn>255-60</MedlinePgn>
<AbstractText>Steroids Abstracts </AbstractText>
<PMID>10605437</PMID>
<Year>2001</Year>
<ArticleTitle>Hormone</ArticleTitle>
<MedlinePgn>123-34</MedlinePgn>
<AbstractText>Hormones Abstracts</AbstractText>

I used
sed -n -e 's/.*<PMID>\(.*\)<\/PMID>.*/\1/p' \
-e 's/.*<ArticleTitle>\(.*\)<\/ArticleTitle>.*/\1/p' \
-e 's/.*<AbstractText>\(.*\)<\/AbstractText>.*/\1/p' \
Example

I get the output
10605436
Steroids
Steroids Abstracts
10605437
Hormone
Hormones Abstracts

How do I modify my sed command so that it prints the information I need on one line per record, i.e.
10605436 Steroids Steroids Abstracts
10605437 Hormone Hormones Abstracts

Hi, welcome to LQ!

And because I know awk better than sed ... ;}

Code:

awk '{payload=gensub(/[^>]+>([^<]+).*/, "\\1", "1")}/PMID|ArticleTitle/{printf "%s\t",payload}/AbstractText/{printf "%s\n",payload}' Example
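A note for anyone without GNU awk: gensub() is a gawk extension, so the one-liner above may fail under other awks such as mawk. Below is a minimal sketch of the same idea using only the standard sub() function. It assumes the input looks like the sample in the first post; the function name extract_posix and the temp file are just for illustration and are not from the thread.

```shell
# Sample input modelled on the first post, written to a throwaway temp file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<PMID>10605436</PMID>
<Year>2000</Year>
<ArticleTitle>Steroids</ArticleTitle>
<MedlinePgn>255-60</MedlinePgn>
<AbstractText>Steroids Abstracts </AbstractText>
<PMID>10605437</PMID>
<Year>2001</Year>
<ArticleTitle>Hormone</ArticleTitle>
<MedlinePgn>123-34</MedlinePgn>
<AbstractText>Hormones Abstracts</AbstractText>
EOF

# Hypothetical helper: same output shape as the gensub one-liner (tab-separated).
extract_posix() {
  awk '
    /<PMID>|<ArticleTitle>|<AbstractText>/ {
      payload = $0
      sub(/^[^>]*>/, "", payload)   # drop everything through the opening tag
      sub(/<.*$/, "", payload)      # drop the closing tag and anything after it
      # AbstractText ends a record; the other two fields are followed by a tab
      printf "%s%s", payload, (/<AbstractText>/ ? "\n" : "\t")
    }
  ' "$1"
}

result=$(extract_posix "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

As in the original, the fields come out tab-separated rather than space-separated; change "\t" to " " if you want exactly the layout asked for.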

Cheers,
Tink

Or maybe:

Code:

awk -F"[><]" '/PMID|ArticleTitle|AbstractText/{ORS=/AbstractText/?"\n":" ";print $3}' file
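In case it is not obvious why $3 holds the tag content in the command above: with -F"[><]" every < and > is a field separator, so the text before the first < becomes an empty $1, the tag name is $2, and the content is $3. A small sketch that prints every field of one sample line (the variable names are only for illustration):

```shell
line='<PMID>10605436</PMID>'

# Print each field with its index so the split is visible.
fields=$(printf '%s\n' "$line" |
  awk -F'[><]' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')

printf '%s\n' "$fields"
# 1=[]
# 2=[PMID]
# 3=[10605436]
# 4=[/PMID]
# 5=[]
```

This only holds when each element sits alone on its own line, as in the sample; it is not a general XML parser.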

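Using GNU sed.
The h command copies the pattern space into the hold space (overwriting it) and H appends a newline plus the pattern space, so together they build the output in the hold space.
The g command copies the contents of the hold space back into the pattern space.
s/\n/ /g then replaces the newlines with spaces.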

Code:

sed -n '/<PMID>/{s/.*>\(.*\)<.*/\1/;h}
/<ArticleTitle>/{s/.*>\(.*\)<.*/\1/;H}
/<AbstractText>/{s/.*>\(.*\)<.*/\1/;H;g;s/\n/ /g;p}' Example
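To watch the hold space at work in isolation, here is a stripped-down sketch of the same h / H / g pattern on three throwaway input lines (first, second, third are made-up values, not from the thread). The l command prints the pattern space unambiguously, so the embedded newlines that H inserted are visible before they are replaced.

```shell
# After h on line 1 and H on the rest, the hold space is "first\nsecond\nthird".
# On the last line, g pulls it back and l shows it (newlines as \n, end marked by $).
raw=$(printf '%s\n' first second third | sed -n '1h; 2,$H; ${g; l;}')
printf '%s\n' "$raw"
# first\nsecond\nthird$

# Same pipeline, but replace the newlines with spaces and print normally.
joined=$(printf '%s\n' first second third | sed -n '1h; 2,$H; ${g; s/\n/ /g; p;}')
printf '%s\n' "$joined"
# first second third
```

The command in the post does the same thing, except that h fires on each PMID line (starting a fresh record) and the g/s/p step fires on each AbstractText line instead of on the last line of input.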

Many thanks to all. It works perfectly!!

Please mark as SOLVED if you have a solution.