LinuxQuestions.org
Old 03-27-2012, 04:39 AM   #1
gbms
LQ Newbie
 
Registered: Mar 2012
Posts: 1

Rep: Reputation: Disabled
Extract tags with multiline values from XML file using sed/awk


Hi,

I have an XML file that holds key-value pairs (basically a Java properties file in XML), as shown below.
The file contains both single-line and multiline tags.

<entry key="KEY1"> tag1 value </entry>
<entry key="KEY2" > hello
world. This is multiline tag example.
blahh blah blah...
</entry>

I want to extract a tag's value by passing the key name from a bash script.
Could somebody give me some pointers on extracting the multiline value of a tag?



Thanks,
gbms
 
Old 03-27-2012, 05:57 AM   #2
grail
Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 7,698

Rep: Reputation: 1988
This might get you going:
Code:
awk '{print "|"$0"|"}' RS="[<>\n]+" file
Generally, though, you're probably better off with Perl or Ruby, as they have XML parsers available.
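
As a rough illustration of the parser-based route, here is a minimal sketch. It uses Python's standard xml.etree.ElementTree as a stand-in for the Perl or Ruby parsers mentioned above. The function name get_entry_value and the <properties> root element wrapped around the sample entries are assumptions made for the example; the original post shows only the two entries.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the thread's two entries wrapped in a <properties>
# root element (an assumption; the original post shows only the entries).
SAMPLE = """<properties>
<entry key="KEY1"> tag1 value </entry>
<entry key="KEY2" > hello
world. This is multiline tag example.
blahh blah blah...
</entry>
</properties>"""


def get_entry_value(xml_text, key):
    """Return the stripped text of the <entry> whose key attribute matches, or None."""
    root = ET.fromstring(xml_text)
    for entry in root.iter("entry"):
        if entry.get("key") == key:
            # entry.text keeps embedded newlines; only the ends are trimmed.
            return (entry.text or "").strip()
    return None


if __name__ == "__main__":
    print(get_entry_value(SAMPLE, "KEY2"))
```

A bash script could call a script like this with the key as an argument; reading the key from sys.argv and the XML from a file is left out to keep the sketch short.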
 
Old 03-27-2012, 06:37 AM   #3
catkin
LQ 5k Club
 
Registered: Dec 2008
Location: Tamil Nadu, India
Distribution: Servers: Debian Squeeze and Wheezy. Desktop: Slackware64 14.0. Netbook: Slackware 13.37
Posts: 8,563
Blog Entries: 29

Rep: Reputation: 1179
XMLStarlet has been recommended on LQ. I haven't needed to use it yet, so I cannot say how good it is.
 
Old 03-27-2012, 11:18 AM   #4
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
XML and HTML data structures are (generally) free-form in terms of whitespace and can contain nested values. Both properties are difficult-to-impossible for regular-expression-based, line-oriented programs like sed or awk to parse reliably.

So unless your extraction requirements are trivial and the input is guaranteed to be well-formed and uniform, you're much better off working with tools specifically designed for those languages, as suggested above.
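
To make the whitespace point concrete, here is a small sketch, again in Python with only the standard library. The two sample documents, the naive pattern, and the function names are all hypothetical; they simply contrast a line-by-line regular expression with a real parser on two layouts of the same data.

```python
import re
import xml.etree.ElementTree as ET

# Two hypothetical layouts of the same data; only the whitespace differs.
COMPACT = '<properties><entry key="KEY2">hello world</entry></properties>'
SPREAD = """<properties>
  <entry
      key = "KEY2"
  >hello world</entry
  >
</properties>"""

# A naive single-line pattern of the kind a quick sed/grep attempt might use.
NAIVE = re.compile(r'<entry key="KEY2">(.*)</entry>')


def naive_lookup(xml_text):
    """Apply the pattern line by line, as a line-based tool would."""
    for line in xml_text.splitlines():
        match = NAIVE.search(line)
        if match:
            return match.group(1)
    return None


def parser_lookup(xml_text):
    """Find the same entry with a real XML parser."""
    for entry in ET.fromstring(xml_text).iter("entry"):
        if entry.get("key") == "KEY2":
            return entry.text
    return None


if __name__ == "__main__":
    for name, doc in (("compact", COMPACT), ("spread", SPREAD)):
        print(name, naive_lookup(doc), parser_lookup(doc))
```

Both layouts are legal XML, so the parser returns the same value for each, while the line-based pattern only matches the compact one. A more elaborate regular expression could handle this particular case, but each new layout variation needs another special case.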

xmlstarlet is probably a good place to start. Like catkin, I don't know much about it personally, but it has a good set of documentation here:

http://xmlstar.sourceforge.net/docs.php

Also, please use [code][/code] tags around your code and data, to preserve formatting and to improve readability. Please do not use quote tags, colors, or other fancy formatting.
 
  

