LinuxQuestions.org
Old 04-09-2012, 09:31 PM   #1
adamzuber
LQ Newbie
 
Registered: Apr 2012
Posts: 4

Rep: Reputation: Disabled
Parse XML tags with attribute with SED


Hi people, I've been searching all over the net for the correct regex/command, but can't find anything.

file.xml
Code:
<koko value="92029">
<arnab>5</arnab>
<kambing>3</kambing>
</koko>
<xoxo value="13245">
<kambing>2</kambing>
<kambing>3</kambing>
</xoxo>
<popo value="12345">
<kambing>2</kambing>
<kambing>3</kambing>
</popo>
I would like to extract only a specific tag, including its attribute, e.g.:
Code:
<xoxo value="13245">
<kambing>2</kambing>
<kambing>3</kambing>
</xoxo>
I tried egrep before with this regex, but nothing came up.
Code:
egrep '<xoxo[\s\S]*?/xoxo>' file.xml
I'm currently looking for a one-line solution using something like sed or egrep. I've heard awk could do it too, but its regex syntax is too difficult for me to understand. Any hint toward a solution is much appreciated.

Thanks,
Adam.
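
Editor's note: the egrep attempt prints nothing because grep tests one line at a time, and no single line of file.xml contains both `<xoxo` and `/xoxo>`. In addition, `[\s\S]` and the lazy `*?` are Perl-style syntax that POSIX extended regular expressions do not define. A minimal sketch of one way to keep that regex, assuming GNU grep built with PCRE support (`-P`); `-z` makes grep treat the whole file as a single record, and the temporary directory is only for illustration:

```shell
#!/bin/sh
# Recreate the sample input from the post in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/file.xml" <<'EOF'
<koko value="92029">
<arnab>5</arnab>
<kambing>3</kambing>
</koko>
<xoxo value="13245">
<kambing>2</kambing>
<kambing>3</kambing>
</xoxo>
<popo value="12345">
<kambing>2</kambing>
<kambing>3</kambing>
</popo>
EOF

# -P: Perl-compatible regex, so [\s\S] and the lazy *? behave as intended.
# -z: records end at NUL instead of newline, so the match may span lines.
# -o: print only the matched text; tr turns the trailing NUL into a newline.
match=$(grep -Pzo '<xoxo[\s\S]*?</xoxo>' "$workdir/file.xml" | tr '\0' '\n')
printf '%s\n' "$match"
```

This is not portable: BSD and BusyBox grep do not provide `-P`, so the sed and awk approaches elsewhere in the thread are the safer choice there.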
 
Old 04-09-2012, 10:01 PM   #2
jhwilliams
Senior Member
 
Registered: Apr 2007
Location: Portland, OR
Distribution: Debian, Android, LFS
Posts: 1,168

Rep: Reputation: 211
Awk would be a better next step than egrep, but I'd suggest the xpath command instead. It's a wrapper around Perl's XML parser.

It assumes, though, that your document is well-formed XML with a single root element, e.g.:

Code:
<?xml version="1.0"?>
<root>
  <koko value="92029">
    <arnab>5</arnab>
    <kambing>3</kambing>
  </koko>
  <xoxo value="13245">
    <kambing>2</kambing>
    <kambing>3</kambing>
  </xoxo>
  <popo value="12345">
    <kambing>2</kambing>
    <kambing>3</kambing>
  </popo>
</root>
Then:

Code:
jameson@yellow:~$ xpath -q -e '/root/xoxo' input.xml 
<xoxo value="13245">
  <kambing>2</kambing>
  <kambing>3</kambing>
</xoxo>
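
Editor's note: for readers curious about the awk route mentioned above, here is a minimal sketch using an awk range pattern. It works on plain lines rather than parsed XML, so it assumes the opening and closing tags sit on separate lines and that `xoxo` elements are not nested; the sample file name and temporary directory are illustrative only.

```shell
#!/bin/sh
# Sample input without a wrapping root element, as in the original question.
workdir=$(mktemp -d)
cat > "$workdir/file.xml" <<'EOF'
<koko value="92029">
<arnab>5</arnab>
<kambing>3</kambing>
</koko>
<xoxo value="13245">
<kambing>2</kambing>
<kambing>3</kambing>
</xoxo>
<popo value="12345">
<kambing>2</kambing>
<kambing>3</kambing>
</popo>
EOF

# A range pattern "start,end" with no action prints every line from the
# first line matching start through the next line matching end, inclusive.
# [ >] after the tag name avoids matching longer names such as <xoxos>.
result=$(awk '/<xoxo[ >]/,/<\/xoxo>/' "$workdir/file.xml")
printf '%s\n' "$result"
```

Unlike xpath, this does not care whether the file is well-formed, which is both its advantage and its risk.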
 
Old 04-09-2012, 10:22 PM   #3
adamzuber
LQ Newbie
 
Registered: Apr 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
Hi jhwilliams, thanks for the reply. I managed to get what I want with xpath, but what if it is not a valid XML file? I ask because I have a few data files that I have to merge.

Example:
Code:
cat file1.xml file2.xml > file3.xml
Assuming file1.xml and file2.xml are each valid XML files, the resulting file3.xml is not a valid XML file. I tested it with xpath, but it fails because of the invalid XML and path. Is there any workaround?
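
Editor's note: plain concatenation yields a file with more than one root element (and possibly more than one XML declaration), which is why an XML parser rejects it. One possible workaround, sketched below, is to wrap the merged content in a synthetic root. It assumes each input's `<?xml ...?>` declaration, if present, sits on a line of its own and that the files carry no DOCTYPE; the `<root>` element name and the sample contents are made up for illustration.

```shell
#!/bin/sh
# Two small, individually well-formed inputs (hypothetical contents).
workdir=$(mktemp -d)
printf '%s\n' '<?xml version="1.0"?>' '<koko value="92029">' \
    '<arnab>5</arnab>' '</koko>' > "$workdir/file1.xml"
printf '%s\n' '<?xml version="1.0"?>' '<xoxo value="13245">' \
    '<kambing>2</kambing>' '</xoxo>' > "$workdir/file2.xml"

{
    # Emit a single declaration and open the synthetic root.
    echo '<?xml version="1.0"?>'
    echo '<root>'
    # Drop each file's own declaration; in a basic regex "?" is literal.
    sed '/^<?xml/d' "$workdir/file1.xml" "$workdir/file2.xml"
    echo '</root>'
} > "$workdir/file3.xml"

merged=$(cat "$workdir/file3.xml")
printf '%s\n' "$merged"
```

The merged file can then be queried the same way as in the earlier reply, for example with `xpath -q -e '/root/xoxo' file3.xml`.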
 
Old 04-09-2012, 10:36 PM   #4
adamzuber
LQ Newbie
 
Registered: Apr 2012
Posts: 4

Original Poster
Rep: Reputation: Disabled
Solved using sed.

Code:
sed '/<xoxo/,/<\/xoxo>/!d' notvalidxml.xml
Many thanks,
Adam.
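
Editor's note: the command above selects the range of lines from the first one matching `<xoxo` through the next one matching `</xoxo>`, and `!d` deletes every line outside that range. The sketch below shows that it is equivalent to the perhaps more familiar `sed -n '...p'` form, using a made-up input that has no single root element. Two caveats apply: sed only starts looking for the closing pattern on the line after the opening match, so an element written entirely on one line would extend the range to a later `</xoxo>`, and `/<xoxo/` also matches longer tag names that merely begin with `xoxo`.

```shell
#!/bin/sh
# Concatenated, not well-formed input (hypothetical contents).
workdir=$(mktemp -d)
cat > "$workdir/notvalidxml.xml" <<'EOF'
<koko value="92029">
<arnab>5</arnab>
</koko>
<xoxo value="13245">
<kambing>2</kambing>
<kambing>3</kambing>
</xoxo>
<popo value="12345">
<kambing>2</kambing>
</popo>
EOF

# Form used in the thread: delete every line NOT inside the range.
deleted=$(sed '/<xoxo/,/<\/xoxo>/!d' "$workdir/notvalidxml.xml")

# Equivalent form: suppress default output (-n), print only the range.
printed=$(sed -n '/<xoxo/,/<\/xoxo>/p' "$workdir/notvalidxml.xml")

printf '%s\n' "$printed"
```

If the file contains several `xoxo` blocks, both forms print all of them, one after another.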
 
Old 04-09-2012, 10:37 PM   #5
firstfire
Member
 
Registered: Mar 2006
Location: Ekaterinburg, Russia
Distribution: Debian, Ubuntu
Posts: 709

Rep: Reputation: 428
Removed
 
  

