02-11-2007, 01:43 AM
#1
Member
Registered: Nov 2005
Location: Land of Linux
Distribution: Quad boot :: Windows vista 64-bit | Vector Linux | Slackware 13.0 64-Bit | Linux Mint 7 64-bit
Posts: 127
Thanked: 0
[Grep,Awk,Sed]Parsing text between XML tags.
Hello, I have a small problem with my bash script: I need to get all the text between XML section tags such as <com_section> and </com_section>, using awk, sed or grep.
Code:
<com_section>
<com_create_instance inprocserver32="%SystemRoot%\system32\shdocvw.dll" interfaceid="{000214E6-0000-0000-C000-000000000046}"/>
<com_get_class_object inprocserver32="C:\WINDOWS\system32\urlmon.dll" interfaceid="{00000001-0000-0000-C000-000000000046}"/>
</com_section>
<dll_handling_section>
<load_dll dll="c:\910ac0d71833d902c1a824c0335761eb.exe" successful="1"/>
<load_dll dll="C:\WINDOWS\system32\ntdll.dll" successful="1"/>
<load_dll dll="C:\WINDOWS\system32\kernel32.dll" successful="1"/>
<load_dll dll="VERSION.dll" successful="1"/>
<snip>1-50 lines removed to make it easier to read.</snip>
</dll_handling_section>
<filesystem_section>
<delete_file filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<copy_file filetype="File" srcfile="c:\910ac0d71833d902c1a824c0335761eb.exe" dstfile="C:\WINDOWS\system32\algs.exe" creationdistribution="CREATE_ALWAYS" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<set_file_attributes filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="FILE_ATTRIBUTE_HIDDEN,SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<open_file filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" creationdistribution="OPEN_EXISTING" desiredaccess="FILE_ANY_ACCESS" shareaccess="SHARE_WRITE" flags="FILE_ATTRIBUTE_NORMAL,SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<set_file_time filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<delete_file filetype="File" srcfile="bonbw.bat" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<snip>1-50 lines removed to make it easier to read.</snip>
</filesystem_section>
<mutex_section>
<create_mutex name="dcf7d2f7071938ba83b50c70eedd5ceb8984" owned="0"/>
<create_mutex name="CTF.LBES.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Compart.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Asm.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Layouts.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.TMD.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.TimListCache.FMPDefaultS-1-5-21-1645522239-706699826-839522115-1003MUTEX.DefaultS-1-5-21-16455222" owned="0"/>
<create_mutex name="ZonesCounterMutex" owned="0"/>
<create_mutex name="ZonesCacheCounterMutex" owned="0"/>
<create_mutex name="ZonesLockedCacheCounterMutex" owned="0"/>
</mutex_section>
I have tried several different approaches, but I can only get the text between the tags if both tags are on the same line, like this:
Code:
<com_section>foo</com_section>
But if the tags are on different lines, I do not know how to make it work:
Code:
<com_section>
foo
</com_section>
Any hints?
TIA,
///////
02-11-2007, 02:24 AM
#2
Member
Registered: Nov 2005
Location: Land of Linux
Distribution: Quad boot :: Windows vista 64-bit | Vector Linux | Slackware 13.0 64-Bit | Linux Mint 7 64-bit
Posts: 127
Thanked: 0
Original Poster
Ooops.
Sorry, I found a solution for this one:
Code:
sed -n '/<com_section>/,/<\/com_section>/p'
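For anyone finding this later, here is a minimal sketch of how that range address behaves. The sample input is made up (modelled on post #1), and the second command is only one possible way to drop the tag lines, not something from the original post.

```shell
# Hypothetical sample input, modelled on the report in post #1.
sample='<com_section>
foo
</com_section>
<mutex_section>
bar
</mutex_section>'

# Range address: print every line from the opening tag through the closing tag.
with_tags=$(printf '%s\n' "$sample" \
  | sed -n '/<com_section>/,/<\/com_section>/p')

# Same range, but skip the two tag lines and print only the lines in between.
body_only=$(printf '%s\n' "$sample" \
  | sed -n '/<com_section>/,/<\/com_section>/{/<\/*com_section>/!p;}')

printf '%s\n' "$with_tags"
printf '%s\n' "$body_only"
```

One caveat: sed starts looking for the end of the range on the line after the one that opened it, so if both tags sit on a single line the range runs on until the next closing tag (or the end of the input).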
02-15-2007, 04:44 PM
#3
Member
Registered: Sep 2005
Location: Nummela, Southern Finland
Distribution: Fedora Core 4, Fedora Core 5, Redhat 9.0, Solaris 10, WInXP pro, W2k pro
Posts: 30
Thanked: 0
Oh dear...
You, my friend, just solved one of my biggest problems. I had to find the RewriteRules in a very complex Apache config (a huge number of virtual hosts, only a few of which I needed, each with gigantic RewriteRule sets).
I really appreciate that you posted the solution to your problem.
It is not common (what a pity) for people to share their solutions.
Thank you VERY MUCH!
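A rough sketch of how the same sed range could be applied to that kind of Apache config. The config fragment, hostnames and rules below are invented for illustration, and it assumes the ServerName line comes before the RewriteRule lines inside each virtual host.

```shell
# Hypothetical Apache config fragment; hostnames and rules are made up.
conf='<VirtualHost *:80>
ServerName www.example.com
RewriteEngine On
RewriteRule ^/old$ /new [R=301,L]
</VirtualHost>
<VirtualHost *:80>
ServerName other.example.com
RewriteRule ^/a$ /b [L]
</VirtualHost>'

# Print from the wanted ServerName line to the end of that virtual host,
# then keep only the RewriteRule lines.
rules=$(printf '%s\n' "$conf" \
  | sed -n '/ServerName www\.example\.com/,/<\/VirtualHost>/p' \
  | grep 'RewriteRule')

printf '%s\n' "$rules"
```

Rules placed above the ServerName line would be missed by this, so it is only a starting point.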
01-20-2009, 03:49 AM
#4
LQ Newbie
Registered: Jan 2009
Distribution: Red hat
Posts: 1
Thanked: 0
I had data where the opening and closing tags could be on the same line or on separate lines, with no nested tags. This solution using awk worked for me:
Code:
awk -F'[<|>]' '/Testcase/{print $3}'
Last edited by sxjthefirst; 01-20-2009 at 03:50 AM ..
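A note on the awk one-liner above: with `<`, `|` and `>` as field separators, `$3` is the text that follows the first tag on the same line, so it only returns content when the opening tag and the text share a line. Below is a hedged sketch of an alternative that also covers tags on separate lines. It assumes no nested tags and at most one tag pair per line; the sample data is invented, and only the `Testcase` tag name comes from the post.

```shell
# Hypothetical sample: one pair on a single line, one pair spread over lines.
sample='<Testcase>alpha</Testcase>
<Testcase>
beta
</Testcase>'

extracted=$(printf '%s\n' "$sample" | awk '
  {
    line = $0
    # Opening tag: start capturing and drop everything up to and including it.
    if (!inside && match(line, /<Testcase>/)) {
      inside = 1
      line = substr(line, RSTART + RLENGTH)
    }
    if (inside) {
      # Closing tag: keep only the text in front of it and stop capturing.
      if (match(line, /<\/Testcase>/)) {
        line = substr(line, 1, RSTART - 1)
        inside = 0
      }
      # Skip lines that are empty once the tags are stripped.
      if (line != "") print line
    }
  }')

printf '%s\n' "$extracted"
```

On the sample above this prints `alpha` and `beta`. Blank lines inside a tag pair are dropped as a side effect of the last check.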
All times are GMT -5.