02-11-2007, 12:43 AM
|
#1
|
Member
Registered: Nov 2005
Location: Land of Linux :: Finland
Distribution: Pop!_OS && Windows 10 && Arch Linux
Posts: 831
|
[Grep,Awk,Sed]Parsing text between XML tags.
Hello, I have a small problem with my bash script: I need to extract all the text between the XML section tags (for example <com_section> and </com_section>) using awk, sed, or grep.
Code:
<com_section>
<com_create_instance inprocserver32="%SystemRoot%\system32\shdocvw.dll" interfaceid="{000214E6-0000-0000-C000-000000000046}"/>
<com_get_class_object inprocserver32="C:\WINDOWS\system32\urlmon.dll" interfaceid="{00000001-0000-0000-C000-000000000046}"/>
</com_section>
<dll_handling_section>
<load_dll dll="c:\910ac0d71833d902c1a824c0335761eb.exe" successful="1"/>
<load_dll dll="C:\WINDOWS\system32\ntdll.dll" successful="1"/>
<load_dll dll="C:\WINDOWS\system32\kernel32.dll" successful="1"/>
<load_dll dll="VERSION.dll" successful="1"/>
<snip>1-50 lines removed to make it easier to read.</snip>
</dll_handling_section>
<filesystem_section>
<delete_file filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<copy_file filetype="File" srcfile="c:\910ac0d71833d902c1a824c0335761eb.exe" dstfile="C:\WINDOWS\system32\algs.exe" creationdistribution="CREATE_ALWAYS" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<set_file_attributes filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="FILE_ATTRIBUTE_HIDDEN,SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<open_file filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" creationdistribution="OPEN_EXISTING" desiredaccess="FILE_ANY_ACCESS" shareaccess="SHARE_WRITE" flags="FILE_ATTRIBUTE_NORMAL,SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<set_file_time filetype="File" srcfile="C:\WINDOWS\system32\algs.exe" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<delete_file filetype="File" srcfile="bonbw.bat" desiredaccess="FILE_ANY_ACCESS" flags="SECURITY_ANONYMOUS" fileinformationclass="FileBasicInformation"/>
<snip>1-50 lines removed to make it easier to read.</snip>
</filesystem_section>
<mutex_section>
<create_mutex name="dcf7d2f7071938ba83b50c70eedd5ceb8984" owned="0"/>
<create_mutex name="CTF.LBES.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Compart.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Asm.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.Layouts.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.TMD.MutexDefaultS-1-5-21-1645522239-706699826-839522115-1003" owned="0"/>
<create_mutex name="CTF.TimListCache.FMPDefaultS-1-5-21-1645522239-706699826-839522115-1003MUTEX.DefaultS-1-5-21-16455222" owned="0"/>
<create_mutex name="ZonesCounterMutex" owned="0"/>
<create_mutex name="ZonesCacheCounterMutex" owned="0"/>
<create_mutex name="ZonesLockedCacheCounterMutex" owned="0"/>
</mutex_section>
I have tried several different approaches, but I can only get the text between the tags when both tags are on the same line, like this:
Code:
<com_section>foo</com_section>
But when the tags are on different lines, I do not know how to make it work:
Code:
<com_section>
foo
</com_section>
Any hints?
TIA,
///////
|
|
|
02-11-2007, 01:24 AM
|
#2
|
Member
Registered: Nov 2005
Location: Land of Linux :: Finland
Distribution: Pop!_OS && Windows 10 && Arch Linux
Posts: 831
Original Poster
|
Oops.
Sorry, I found a solution for this one:
Code:
sed -n '/<com_section>/,/<\/com_section>/p'
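For anyone who wants only the lines between the tags, without the tag lines themselves, here is a minimal sketch of one way to extend that range command. The file name sample.xml and its contents are made up for illustration.

```shell
# Hypothetical sample input; sample.xml is an assumed name, not from the thread.
cat > sample.xml <<'EOF'
<com_section>
foo
</com_section>
<mutex_section>
bar
</mutex_section>
EOF

# The range command from the post: prints the section including both tag lines.
with_tags=$(sed -n '/<com_section>/,/<\/com_section>/p' sample.xml)

# Same range, but delete the opening and closing tag lines and print the rest.
inner=$(sed -n '/<com_section>/,/<\/com_section>/{/<com_section>/d;/<\/com_section>/d;p;}' sample.xml)

printf '%s\n' "$inner"
```

Note that this still works line by line, so it assumes each tag sits on a line of its own, as in the sample above.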
|
|
|
02-15-2007, 03:44 PM
|
#3
|
Member
Registered: Sep 2005
Location: Nummela, Southern Finland
Distribution: Fedora Core 4, Fedora Core 5, Redhat 9.0, Solaris 10, WInXP pro, W2k pro
Posts: 30
|
Oh dear...
You, my friend, just solved one of my biggest problems. I had a task to find the RewriteRules in a very complex Apache config (a huge number of virtual hosts, only a few of them needed, with gigantic RewriteRule sets inside).
I really do appreciate that you posted the solution to your problem...
It is not common (what a pity) that people share their solutions.
Thank you VERY MUCH!
|
|
|
01-20-2009, 02:49 AM
|
#4
|
LQ Newbie
Registered: Jan 2009
Distribution: Red hat
Posts: 1
|
I had data where the open and close tags could be on multiple lines or on the same line, with no nested tags. This awk solution worked:
Code:
awk -F'[<|>]' '/Testcase/{print $3}'
Last edited by sxjthefirst; 01-20-2009 at 02:50 AM.
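One way to cover both layouts (tags on one line, or spread over several lines) is a flag-based awk script. This is only a sketch and not the command from the post above. The file cases.xml and its contents are made up, and the tag name Testcase is borrowed from that post.

```shell
# Hypothetical input with both layouts; cases.xml is an assumed name.
cat > cases.xml <<'EOF'
<Testcase>one</Testcase>
<Testcase>
two
</Testcase>
EOF

texts=$(awk '
  /<Testcase>/ { inside = 1 }            # opening tag seen: start collecting
  inside {
    line = $0
    gsub(/<\/?Testcase>/, "", line)      # strip the tags themselves
    if (line != "") print line           # skip lines that held only a tag
  }
  /<\/Testcase>/ { inside = 0 }          # closing tag seen: stop collecting
' cases.xml)

printf '%s\n' "$texts"
```

As with the original, this assumes the tags are not nested.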
|
|
|
07-26-2011, 11:38 AM
|
#5
|
LQ Newbie
Registered: Jul 2011
Posts: 1
|
Hello,
I have a similar problem. I want to check the data between the two tags <filesystem_section> and </filesystem_section>: if the section contains delete_file filetype="File", then replace the filetype values so that the result is
copy_file filetype="New_File1" and
set_file_attributes filetype="New_File2". The <filesystem_section> </filesystem_section> tags occur multiple times. How can we do this?
Thanks
|
|
|
07-26-2011, 11:54 AM
|
#6
|
LQ 5k Club
Registered: Aug 2005
Distribution: OpenSuse, Fedora, Redhat, Debian
Posts: 5,399
|
Neither sed nor grep, alone or in combination, is up to the task. Awk might give you a fighting chance, but it is not ideal.
Use Perl and one of the mature XML parsers written for it. Search CPAN for details. Don't try to re-invent that particular wheel unless you're convinced that you can improve upon it (and since you're asking the question here, that seems unlikely).
--- rod.
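For a quick one-off where pulling in a parser is not an option, the awk route mentioned above might look like the sketch below. It is not a real XML parser. It assumes one element per line (as in the sample from post #1) and reads the question in post #5 as "in every filesystem_section that contains delete_file filetype="File", set the filetype of copy_file to New_File1 and of set_file_attributes to New_File2". The file report.xml and its contents are made up.

```shell
# Hypothetical input; report.xml is an assumed name with simplified attributes.
cat > report.xml <<'EOF'
<filesystem_section>
<delete_file filetype="File" srcfile="a.exe"/>
<copy_file filetype="File" srcfile="a.exe" dstfile="b.exe"/>
<set_file_attributes filetype="File" srcfile="b.exe"/>
</filesystem_section>
<filesystem_section>
<copy_file filetype="File" srcfile="c.exe" dstfile="d.exe"/>
</filesystem_section>
EOF

rewritten=$(awk '
  /<filesystem_section>/ { inside = 1; n = 0; hit = 0 }
  inside {
    buf[++n] = $0                                   # buffer the whole section
    if ($0 ~ /<delete_file filetype="File"/) hit = 1
    if ($0 ~ /<\/filesystem_section>/) {            # section complete: emit it
      for (i = 1; i <= n; i++) {
        line = buf[i]
        if (hit && line ~ /<copy_file /)
          sub(/filetype="[^"]*"/, "filetype=\"New_File1\"", line)
        if (hit && line ~ /<set_file_attributes /)
          sub(/filetype="[^"]*"/, "filetype=\"New_File2\"", line)
        print line
      }
      inside = 0
    }
    next
  }
  { print }                                         # lines outside any section
' report.xml)

printf '%s\n' "$rewritten"
```

Each section is buffered until its closing tag, so the delete_file line may appear anywhere in the section. A section that is never closed is silently dropped, which is one of the reasons a proper parser is the safer choice for anything beyond a one-off.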
|
|
|
All times are GMT -5.