LinuxQuestions.org
Old 09-11-2009, 02:41 AM   #1
madvicious
LQ Newbie
 
Registered: Sep 2009
Posts: 3

Rep: Reputation: 0
Help extracting a matching pattern and next lines of match


Hi there,

I'm having trouble writing an awk script (that is what I've tried so far, but other approaches are surely possible) for the following file:

file.txt

<register>
<createProfile>
<result>0</result>
<description><![CDATA[OK]]></description>
<msisdn>34661461174</msisdn>
<inputOmvID>1</inputOmvID>
<inputGroupID>-2</inputGroupID>
<ProfileOmvID>1</ProfileOmvID>
<contentID>3365</contentID>
<contentProfileID>3525</contentProfileID>
<chargingProfileTypeId>22</chargingProfileTypeId>
<operationID>201022</operationID>
...

I have to test whether <createProfile> is in the file. If it is, I have to extract the lines

<msisdn>34661461174</msisdn>

and <contentProfileID>3525</contentProfileID>

so I tried starting with something like this

> awk '/^<createProfile>/{getline;print}' file.txt

but this only prints the line after the one matching <createProfile>.

With this script

> awk '/^<createProfile>/ {print NR,$0}' file.txt

I get the line where the regex matches, but I don't know how to go on to print the records for <msisdn>34661461174</msisdn> and <contentProfileID>3525</contentProfileID>

The file always has this structure; that is, all the tags are in the same position whenever the first pattern matches.
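
Since the layout is described as fixed, one possible approach is to print the lines at known offsets from the match. This is a minimal sketch, assuming the excerpt above is representative (msisdn three lines after <createProfile>, contentProfileID eight lines after). The temporary file and the function name extract_by_offset are made up for illustration.

```shell
#!/bin/bash
# Hypothetical sample data, copied from the excerpt in the post.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<register>
<createProfile>
<result>0</result>
<description><![CDATA[OK]]></description>
<msisdn>34661461174</msisdn>
<inputOmvID>1</inputOmvID>
<inputGroupID>-2</inputGroupID>
<ProfileOmvID>1</ProfileOmvID>
<contentID>3365</contentID>
<contentProfileID>3525</contentProfileID>
<chargingProfileTypeId>22</chargingProfileTypeId>
<operationID>201022</operationID>
EOF

# When <createProfile> is seen, remember the two line numbers of interest,
# then print every line whose number equals one of them.
extract_by_offset() {
  awk '/^<createProfile>/ { m = NR + 3; c = NR + 8 }
       NR == m || NR == c' "$1"
}

extract_by_offset "$sample"
```

This only works while the offsets never change, so it is more brittle than matching on the tag names.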

Thank you for any help
 
Old 09-11-2009, 04:39 AM   #2
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,458

Rep: Reputation: 1941
You can try a flag: switch it on every time the <createProfile> tag is encountered. While the flag is on, the <msisdn> and <contentProfileID> lines are printed (or processed). Switch the flag off at the last tag you need, in the order the tags appear. E.g. something like:
Code:
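# Switch the flag on at the start of a createProfile record
/^<createProfile>/ {
  isCreate = 1
}
# Pattern with no action: awk prints the matching line by default
isCreate && /^<msisdn>/
# Last wanted tag: print it, then switch the flag off
isCreate && /^<contentProfileID>/ {
  print
  isCreate = 0
}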
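
As a rough illustration of how the flag behaves, the rules can be saved to a file and run with awk -f. The temporary files and the second record (<deleteProfile> is an invented tag) are assumptions made for this example only.

```shell
#!/bin/bash
# Hypothetical input: one <createProfile> record followed by a different
# record type (<deleteProfile> is an invented tag) with the same fields.
input=$(mktemp)
cat > "$input" <<'EOF'
<register>
<createProfile>
<result>0</result>
<msisdn>34661461174</msisdn>
<contentID>3365</contentID>
<contentProfileID>3525</contentProfileID>
<operationID>201022</operationID>
</createProfile>
<deleteProfile>
<msisdn>34600000000</msisdn>
<contentProfileID>9999</contentProfileID>
</deleteProfile>
</register>
EOF

# The same rules as in the post, stored in a script file for awk -f.
rules=$(mktemp)
cat > "$rules" <<'EOF'
/^<createProfile>/ { isCreate = 1 }
isCreate && /^<msisdn>/
isCreate && /^<contentProfileID>/ { print; isCreate = 0 }
EOF

awk -f "$rules" "$input"
```

Only the two lines from the <createProfile> record are printed, because the flag is already off when the second record is read.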
 
Old 09-11-2009, 05:19 AM   #3
madvicious
LQ Newbie
 
Registered: Sep 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Thanks, colucix. In another forum I got this answer, which matches my needs:


#!/bin/bash

awk '
/createProfile/{f=1}
f && /createProfile/
f && /msisdn/
f && /contentProfileID/
' file.txt

and it can match more than one set of XML.

Thank you very much for your kind answer.

Best wishes
 
Old 09-11-2009, 05:23 AM   #4
colucix
Moderator
 
Registered: Sep 2003
Location: Bologna
Distribution: CentOS 6.5 OpenSuSE 12.3
Posts: 10,458

Rep: Reputation: 1941
Indeed, it is much the same solution, except for the "switching off" part.
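
The practical difference shows up when other records follow the <createProfile> one. Here is a small sketch, using an invented <deleteProfile> record as follow-up data; the function names are illustrative only.

```shell
#!/bin/bash
# Hypothetical input: a <createProfile> record, then an invented
# <deleteProfile> record that should not be reported.
records='<createProfile>
<msisdn>34661461174</msisdn>
<contentProfileID>3525</contentProfileID>
</createProfile>
<deleteProfile>
<msisdn>34600000000</msisdn>
<contentProfileID>9999</contentProfileID>
</deleteProfile>'

# Flag is never cleared: every later msisdn/contentProfileID line prints.
# (Trimmed variant of the script from post #3, without the tag line itself.)
without_reset() {
  awk '/createProfile/ { f = 1 }
       f && /msisdn/
       f && /contentProfileID/'
}

# Flag is cleared after the last wanted tag of the record.
with_reset() {
  awk '/^<createProfile>/ { f = 1 }
       f && /^<msisdn>/
       f && /^<contentProfileID>/ { print; f = 0 }'
}

echo "without reset:"; printf '%s\n' "$records" | without_reset
echo "with reset:";    printf '%s\n' "$records" | with_reset
```

Without the reset, the fields of the second record are printed as well. With it, only the first record's two lines appear.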
 
Old 09-11-2009, 05:40 AM   #5
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
If createProfile always precedes the items you want to extract, you can use sed easily as well:
Code:
sed -n '/createProfile/,${ /msisdn/p
                           /contentProfileID/p
                          }' file.txt
<msisdn>34661461174</msisdn>
<contentProfileID>3525</contentProfileID>
If createProfile can appear anywhere, you can save the lines in a variable or array and print them after the whole file has been read. With sed, you could append the msisdn lines to the hold space. With awk, you would probably store them in variables and print them in the END block.
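
For the awk variant of that idea, one possible shape is to remember the wanted lines as they go by and decide in the END block whether to print them. This is a sketch only: the function name and the sample ordering (fields before <createProfile>) are assumptions chosen to exercise that case.

```shell
#!/bin/bash
# Store the wanted lines while reading; print them at the end only if
# <createProfile> was seen somewhere in the input.
extract_if_created() {
  awk '/<createProfile>/    { found = 1 }
       /<msisdn>/           { msisdn = $0 }
       /<contentProfileID>/ { profile = $0 }
       END {
         if (found) {
           print msisdn
           print profile
         }
       }'
}

# Hypothetical input where the tag of interest comes after the fields.
printf '%s\n' \
  '<register>' \
  '<msisdn>34661461174</msisdn>' \
  '<contentProfileID>3525</contentProfileID>' \
  '<createProfile>' \
  '</register>' | extract_if_created
```

As written, this keeps only the last matching line of each kind; an array would be needed if the file holds several records.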
 
Old 09-11-2009, 05:58 AM   #6
madvicious
LQ Newbie
 
Registered: Sep 2009
Posts: 3

Original Poster
Rep: Reputation: 0
Thank you, jschiwal, that's another great solution.

I'll keep it in mind too.
 
Old 09-12-2009, 07:11 PM   #7
jschiwal
Guru
 
Registered: Aug 2001
Location: Fargo, ND
Distribution: SuSE AMD64
Posts: 15,733

Rep: Reputation: 654
For extracting particular items from XML files, look at using xsltproc. That is what it is designed for.
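
A rough sketch of what that might look like. It assumes xsltproc (from libxslt) is installed, that the file is well-formed XML, and that <msisdn> and <contentProfileID> are children of <createProfile>. The excerpt in the first post is truncated, so the real nesting is not known, and the temporary files are placeholders.

```shell
#!/bin/bash
# Hypothetical, well-formed version of the data (closing tags are assumed).
xml=$(mktemp)
cat > "$xml" <<'EOF'
<register>
  <createProfile>
    <result>0</result>
    <msisdn>34661461174</msisdn>
    <contentID>3365</contentID>
    <contentProfileID>3525</contentProfileID>
  </createProfile>
</register>
EOF

# Stylesheet: for each createProfile element, print the two values as text.
xsl=$(mktemp)
cat > "$xsl" <<'EOF'
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/">
    <xsl:for-each select="//createProfile">
      <xsl:value-of select="normalize-space(msisdn)"/>
      <xsl:text> </xsl:text>
      <xsl:value-of select="normalize-space(contentProfileID)"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
EOF

# Run the transformation only if the tool is available.
if command -v xsltproc >/dev/null 2>&1; then
  xsltproc "$xsl" "$xml"
else
  echo "xsltproc not found; install libxslt to try this" >&2
fi
```

Because the stylesheet works on elements rather than lines, it does not matter how the values are wrapped or indented in the file.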
 
Old 09-12-2009, 07:38 PM   #8
ghostdog74
Senior Member
 
Registered: Aug 2006
Posts: 2,695
Blog Entries: 5

Rep: Reputation: 240
You can also combine them:
Code:
...
f && /createProfile|msisdn|contentProfileID/
..
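
Spelled out in full, the combined version might read as follows. The sample input is shortened from the first post, and the function name extract_combined is only for illustration.

```shell
#!/bin/bash
# One alternation replaces the three separate "f && /.../" rules.
extract_combined() {
  awk '/createProfile/ { f = 1 }
       f && /createProfile|msisdn|contentProfileID/' "$@"
}

printf '%s\n' \
  '<register>' \
  '<createProfile>' \
  '<result>0</result>' \
  '<msisdn>34661461174</msisdn>' \
  '<contentID>3365</contentID>' \
  '<contentProfileID>3525</contentProfileID>' | extract_combined
```

Called with a file name instead of a pipe, e.g. extract_combined file.txt, it reads that file.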
 
Old 09-13-2009, 01:01 AM   #9
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 453
I do not think XML is line-oriented. If I'm right, your approach is wrong, because nobody promises that an item will stay on one line forever; e.g., one day it may become


Code:
<msisdn>
  34661461174
</msisdn>

Again, if I'm correct, use a true XML parser.
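
The concern is easy to reproduce. In this sketch (sample data invented, laid out like the snippet in this post), the line-based filter from earlier in the thread prints only the tag lines and loses the value.

```shell
#!/bin/bash
# Same element, but spread over three lines instead of one.
multiline='<createProfile>
<msisdn>
  34661461174
</msisdn>'

# Line-oriented matching, as in the awk script from earlier in the thread.
line_based() {
  awk '/createProfile/ { f = 1 }
       f && /msisdn/'
}

printf '%s\n' "$multiline" | line_based
```

The number itself sits on a line that contains no tag name, so no line-based pattern on msisdn can select it.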
 
  

