LinuxQuestions.org
Old 12-03-2008, 12:46 PM   #1
kingmaker2003
LQ Newbie
 
Registered: Dec 2008
Posts: 3

Rep: Reputation: 0
How to get data from XML file tags (from data tags)


Hi, I have a tag in an XML file on Unix like this:

<EmailAddress>abc@gmail.com</EmailAddress>

This tag occurs multiple times in the XML file, and the data is on one continuous line, like below:

<State>UN</State><Zip/><CompanyName/><EmailAddress>FDF@gmail.COM</EmailAddress><PromoType>UNKNOWN</PromoType></Promotion></PromotionList<State>UN</State><Zip/><CompanyName/><EmailAddress>zd4946@gmail.com</EmailAddress>

I have to check whether the data between the <EmailAddress> tags is valid, that is, whether it is an email address or not.

I also have to find the length of the attribute, meaning the tag's data. The script is in ksh.

Sorry if this has already been asked. I searched but did not find a result exactly matching my requirement.

Any help with this?

Last edited by kingmaker2003; 12-03-2008 at 01:28 PM.
 
Old 12-03-2008, 01:47 PM   #2
Telemachos
Member
 
Registered: May 2007
Distribution: Debian
Posts: 754

Rep: Reputation: 60
You could try to write a regular expression to do this, but parsing XML, HTML, etc. with regular expressions is notoriously difficult. If ksh is like Bash in terms of syntax, that sounds like trying to sculpt a piece of marble with a spoon. I would recommend looking at a scripting language with XML parsers available (e.g., Perl, Python or Ruby).
 
Old 12-03-2008, 01:52 PM   #3
kingmaker2003
LQ Newbie
 
Registered: Dec 2008
Posts: 3

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by Telemachos
You could try to write a regular expression to do this, but parsing XML, HTML, etc. with regular expressions is notoriously difficult. If ksh is like Bash in terms of syntax, that sounds like trying to sculpt a piece of marble with a spoon. I would recommend looking at a scripting language with XML parsers available (e.g., Perl, Python or Ruby).
I think we can do this with awk. I found an answer, but it works only for the first occurrence of the <EmailAddress></EmailAddress> tag:

Code:
awk -F '</?EmailAddress>' '{print $2}' 456.xml
But I need every occurrence: the email address tag appears multiple times in the file,
so I need to check the whole XML file for an email address wherever an <EmailAddress></EmailAddress> tag is present.

Last edited by kingmaker2003; 12-03-2008 at 02:42 PM.
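The awk command above prints only `$2`, which is why a file written as one continuous line yields just the first address. A minimal sketch of one way around that, assuming the tags are balanced and never nested: with the opening and closing tags as the field separator, every even-numbered field is an address, so a loop over the fields picks up all of them. The temporary file and its contents are hypothetical stand-ins for 456.xml.

```shell
# Hypothetical sample standing in for 456.xml: everything on one line.
sample=$(mktemp)
printf '%s\n' '<State>UN</State><Zip/><EmailAddress>FDF@gmail.COM</EmailAddress><PromoType>UNKNOWN</PromoType><State>UN</State><EmailAddress>zd4946@gmail.com</EmailAddress>' > "$sample"

# The separator matches both <EmailAddress> and </EmailAddress>, so the
# fields alternate: other text, address, other text, address, ...
# Stepping through the even-numbered fields prints every address on the line.
emails=$(awk -F '</?EmailAddress>' '{ for (i = 2; i <= NF; i += 2) print $i }' "$sample")
printf '%s\n' "$emails"

rm -f "$sample"
```

This remains a text-splitting trick rather than XML parsing, so an empty `<EmailAddress/>` element or a tag carrying attributes would throw off the even/odd pattern.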
 
Old 12-03-2008, 05:25 PM   #4
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
Concur with Telemachos
 
Old 12-04-2008, 08:33 AM   #5
kingmaker2003
LQ Newbie
 
Registered: Dec 2008
Posts: 3

Original Poster
Rep: Reputation: 0
Quote:
Originally Posted by chrism01
Concur with Telemachos
What does that mean?
 
Old 12-04-2008, 05:54 PM   #6
chrism01
LQ Guru
 
Registered: Aug 2004
Location: Sydney
Distribution: Rocky 9.2
Posts: 18,359

Rep: Reputation: 2751
http://dictionary.reference.com/dic?...&search=search
 
Old 12-04-2008, 07:26 PM   #7
xhypno
Member
 
Registered: Sep 2004
Posts: 62

Rep: Reputation: 16
Quote:
Originally Posted by kingmaker2003
I think we can do this with awk. I found an answer, but it works only for the first occurrence of the <EmailAddress></EmailAddress> tag:

Code:
awk -F '</?EmailAddress>' '{print $2}' 456.xml
But I need every occurrence: the email address tag appears multiple times in the file,
so I need to check the whole XML file for an email address wherever an <EmailAddress></EmailAddress> tag is present.
Take a look at egrep's multiline/return regex searching. It will allow you to parse the file for each occurrence of <></> and then pipe that to another egrep that uses -v and looks for <></>.
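One possible reading of this suggestion, sketched under assumptions: the `-o` option (available in GNU and BSD grep, but not required by POSIX) prints each match on its own line, which handles the one-continuous-line layout. The sketch then strips the tags with sed, applies a deliberately loose sanity pattern, and reports the length with `${#addr}`, which ksh supports. The sample string, the variable names, and the `email_re` pattern are all hypothetical; the pattern accepts far less than the mail RFCs allow.

```shell
# Hypothetical one-line sample; the third address is deliberately malformed.
xml='<EmailAddress>FDF@gmail.COM</EmailAddress><Zip/><EmailAddress>zd4946@gmail.com</EmailAddress><EmailAddress>not-an-email</EmailAddress>'

# Loose sanity check only (an assumption): something@something.letters
email_re='^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]+$'

# Pipeline: grep -o emits one tag pair per line, sed removes the tags,
# and the loop classifies each address and measures its length.
report=$(
  printf '%s\n' "$xml" |
    grep -E -o '<EmailAddress>[^<]*</EmailAddress>' |
    sed -e 's/<[^>]*>//g' |
    while IFS= read -r addr; do
      if printf '%s\n' "$addr" | grep -E -q "$email_re"; then
        status=valid
      else
        status=invalid
      fi
      # ${#addr} expands to the number of characters in the address
      printf '%s %s %s\n' "$addr" "${#addr}" "$status"
    done
)
printf '%s\n' "$report"
```

Each output line holds the address, its length, and the verdict, so a calling ksh script could read the three columns back with `read addr len status`.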
 
Old 12-04-2008, 11:12 PM   #8
paulsm4
LQ Guru
 
Registered: Mar 2004
Distribution: SusE 8.2
Posts: 5,863
Blog Entries: 1

Rep: Reputation: Disabled
kingmaker2003 -

I concur with Telemachos and Chrism01. Do yourself a big favor, and learn just enough Perl to parse a little bit of your file. Then see how easy it is to call Perl from your script. Just try it - and I think you'll concur, too.

IMHO .. PSM
 
  




