LinuxQuestions.org
Old 07-10-2014, 06:22 AM   #1
santosh0782
Member
 
Registered: Nov 2013
Posts: 132

Rep: Reputation: Disabled
how to pull only values from the xml tags?


I have a file test.txt
Code:
cat test.txt

<data_block><customer>P1584 - INGENICO SOUTH COOP           </customer><Connect_Type>Internet       </Connect_Type><description>,                          </description>
<File>003 to BMS at 02:10</File>                                                                                                                                        
<File>023 to RBS at 03:10</File>                                                                                                                                        
<File>004 to BMS at 04:10</File>                                                                                                                                        
<File>007 to BMS at 05:01</File>                                                                                                                                        
<File>002 to BMS at 05:01</File>                                                                                                                                        
<File>001 to BMS at 05:01</File>                                                                                                                                        
<ACQ><ACQ_Detail>BMS - Expected  2 files between 02:00 - 05:30</ACQ_Detail><ACQ_Found>  5</ACQ_Found>                                                                   
<ACQ_TEXT>                         </ACQ_TEXT><ACQ_message>Y                             </ACQ_message></ACQ>                                                           
</data_block>
I want to retrieve a few values from this XML file, with leading and trailing spaces removed.

e.g.

1. To pull the value in <ACQ_message>, I tried:

$ sed -n "/P1584/,/<\/data_block>/p" "test.txt"|grep "<ACQ_message>"|awk -F ' ' '{print $2}'|cut -c25-35
output:
Y

What is the best way to get only the value inside <ACQ_message>, with leading and trailing spaces removed? The value can be of any length.

2. Similarly, I want to pull only the value inside the <ACQ_Found> tag. I tried:
$ sed -n "/P1584/,/<\/data_block>/p" "test.txt"|grep "<ACQ_Found>"|awk -F '-' '{print $3}'
output:
05:30</ACQ_Detail><ACQ_Found> 5</ACQ_Found>


Could someone please help?

Last edited by santosh0782; 07-10-2014 at 06:23 AM.
 
Old 07-10-2014, 06:34 AM   #2
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 8,644

Rep: Reputation: 2501
Do not use sed|grep|awk|cut chains; usually this can be solved with a single sed, awk, perl, or similar command.
Anyway, an XML parser would probably be a better idea.
http://stackoverflow.com/questions/4...ng-shellscript
 
1 members found this post helpful.
Old 07-10-2014, 06:40 AM   #3
ndc85430
Member
 
Registered: Apr 2014
Distribution: Slackware
Posts: 92

Rep: Reputation: Disabled
Yeah, I'd also go with something meant for parsing XML (like Python's ElementTree, but there are undoubtedly many choices).
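
As a rough illustration of the ElementTree route, here is a minimal sketch, not a definitive solution. It assumes the real file may contain several <data_block> elements without a single root (so they are wrapped in a synthetic <root>), and that the block of interest can be picked by the start of its <customer> text. The helper name tag_values and the SAMPLE string are made up for this example; the sample is a shortened copy of test.txt from the first post.

```python
import xml.etree.ElementTree as ET

# Shortened copy of test.txt from the first post (only one <File> line kept).
SAMPLE = """<data_block><customer>P1584 - INGENICO SOUTH COOP           </customer><Connect_Type>Internet       </Connect_Type><description>,                          </description>
<File>003 to BMS at 02:10</File>
<ACQ><ACQ_Detail>BMS - Expected  2 files between 02:00 - 05:30</ACQ_Detail><ACQ_Found>  5</ACQ_Found>
<ACQ_TEXT>                         </ACQ_TEXT><ACQ_message>Y                             </ACQ_message></ACQ>
</data_block>"""


def tag_values(xml_text, customer_prefix, tag):
    """Return the stripped text of every <tag> found in the data blocks
    whose <customer> text starts with customer_prefix (hypothetical helper)."""
    # Assumption: the input has no XML declaration and may hold several
    # <data_block> elements, so wrap it in a synthetic root before parsing.
    root = ET.fromstring("<root>" + xml_text + "</root>")
    values = []
    for block in root.iter("data_block"):
        customer = (block.findtext("customer") or "").strip()
        if not customer.startswith(customer_prefix):
            continue
        for elem in block.iter(tag):
            # elem.text is None for empty elements, hence the "or".
            values.append((elem.text or "").strip())
    return values


if __name__ == "__main__":
    print(tag_values(SAMPLE, "P1584", "ACQ_message"))  # ['Y']
    print(tag_values(SAMPLE, "P1584", "ACQ_Found"))    # ['5']
```

To run it against the actual file, the text could be read with open("test.txt").read() and passed in place of SAMPLE. Because str.strip() removes the surrounding whitespace, the length of the value does not matter.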
 
1 members found this post helpful.
Old 07-29-2014, 04:33 AM   #4
santosh0782
Member
 
Registered: Nov 2013
Posts: 132

Original Poster
Rep: Reputation: Disabled
Quote:
Originally Posted by pan64 View Post
do not use sed|grep|awk|cut chains, usually it can be solved with a single sed or awk or perl or ....
Anyway probably an xml parser would be a better idea.
http://stackoverflow.com/questions/4...ng-shellscript
The link you provided is very helpful, thanks a lot :-)
 
Old 07-29-2014, 06:29 AM   #5
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian/ubuntu/suse ...
Posts: 8,644

Rep: Reputation: 2501
Glad I could help.
 
  

