LinuxQuestions.org
Old 09-24-2010, 06:45 AM   #1
Felipe
Member
 
Registered: Oct 2006
Posts: 292

Rep: Reputation: 31
sed and regexp for search in multilines


Hello,

I have a text file with a structure like this:
<managed-data-source .....
...
name="nameDS"
....
/>
<connection-pool ....
....
name="connectionPool"
....
>
...
</connection-pool>

Can anyone tell me a regular expression (with sed or grep) to search for the data sources and the connection pools?

I tried:

sed -n -e '/<connection-pool/,/\/>/p' file
which works fine with connection-pool, but

sed -n -e '/<managed-data-source/,/>/p' file
doesn't work (the real file is bigger; I've cut it down and put it in an echo for testing):

echo ' <managed-data-source connection-pool-name="Example Connection Pool" jndi-name="jdbc/OracleDS" name="OracleDS"/>
<managed-data-source login-timeout="15" connection-pool-name="mds1PoolDS" jndi-name="jdbc/mds1DS" name="mds1"/>
<managed-data-source login-timeout="15" connection-pool-name="mds2PoolDS" jndi-name="jdbc/mds2DS" name="mds2DS"/>
<managed-data-source login-timeout="15" connection-pool-name="mds3PoolDS" jndi-name="jdbc/mds3DS" name="mds3DS"/>
<managed-data-source login-timeout="15" connection-pool-name="mds4PoolDS" jndi-name="jdbc/mds4DS" name="mds4DS"/>

<connection-pool name="Example Connection Pool">
<managed-data-source login-timeout="15" connection-pool-name="mds2PoolDS" jndi-name="jdbc/mds2DS" name="mds5DS"/>

<connection-pool name="mds2PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" max-connect-attempts="5" max-connections="50" min-connections="5" initial-limit="5" used-connection-wait-timeout="30" lower-threshold-limit="10" time-to-live-timeout="300" property-check-interval="90" validate-connection="true" validate-connection-statement="select 1 from dual">
<managed-data-source
login-timeout="15"
connection-pool-name="PrrrPoolDS"
jndi-name="jdbc/PrrrDS"
name="PrrrDS"/>
' | sed -n '/<managed-data-source/,/>/p'


If you run the previous command, you will see that it also displays connection-pool lines.

Why? And how can I avoid it?

Thanks
 
Old 09-24-2010, 07:23 AM   #2
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

Your posted structure and the echo example are not the same.

Could you post or attach a relevant part (or parts) of the text file you are using?
 
Old 09-24-2010, 07:34 AM   #3
Felipe
Member
 
Registered: Oct 2006
Posts: 292

Original Poster
Rep: Reputation: 31
You are right. I tried to extract just a piece of the file....

But the command is fine. In that echo, what I'm trying to do is extract the <managed-data-source .... /> elements using the filter sed -n '/<managed-data-source/,/>/p'. But it returns some text that I don't expect.
If you run it, you'll see that it shows <connection-pool...> even though that is not in the filter, right?

What's wrong?

Thanks
 
Old 09-24-2010, 07:48 AM   #4
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

Maybe you don't understand what I'm saying in my previous post: Post or attach the original input file here. Without it we cannot help you because the given examples in your first post are not the same.

BTW:
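The sed command used in your echo example does what it is asked. A range (/<managed-data-source/,/>/) starts at a line matching the first pattern and ends at the next line, after that one, that matches the second pattern; the end pattern is not tested against the line that opened the range. Your one-line entries already carry their closing > on the opening line, so each range runs on to the following line that contains a >, whatever element that line belongs to. That is how the <connection-pool lines get in, and in your echo example it ends up printing practically everything.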
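
A quick way to check how the range address behaves is to feed it a tiny input. In this sketch the three sample lines and the show_range wrapper are invented for the demonstration; only the sed expression comes from the first post. The end pattern is not tested on the line that opens the range, so the range closes on the next line containing a >, even though that line is a different element:

```shell
# Hypothetical wrapper around the command from the first post.
show_range() {
  sed -n '/<managed-data-source/,/>/p'
}

# Made-up sample: a one-line element, an unrelated tag, one more tag.
printf '%s\n' \
  '<managed-data-source name="a"/>' \
  '<connection-pool name="p">' \
  '<property name="x"/>' | show_range
# Prints the first two lines: the range opens on line 1 and closes on
# line 2, the first LATER line that contains ">". Line 3 is not printed.
```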
 
Old 09-24-2010, 07:55 AM   #5
Felipe
Member
 
Registered: Oct 2006
Posts: 292

Original Poster
Rep: Reputation: 31
OK,

But the problem is that I want the shortest match, I mean, up to the first ">", not the last.
What can I do to match from <managed-data-source to the first occurrence of ">"?
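
One possible approach, sketched here and not tested against the real file: when a line contains <managed-data-source, keep appending following lines to the pattern space until it contains a >, then print it. That way a one-line element is printed on its own and a multi-line element is printed up to its first >. The function name managed_sources is hypothetical, and the sketch assumes an element never starts on the same line where another one ends.

```shell
# Hypothetical helper: print each <managed-data-source element, whether it
# sits on one line or is spread over several, up to its first ">".
managed_sources() {
  sed -n '
    /<managed-data-source/ {
      :join
      />/!{
        N
        b join
      }
      p
    }
  '
}

# Made-up sample mixing a one-line element, another tag and a multi-line element.
printf '%s\n' \
  '<managed-data-source name="a"/>' \
  '<connection-pool name="p">' \
  '<managed-data-source' \
  'login-timeout="15"' \
  'name="b"/>' | managed_sources
```

The N command appends the next input line, and "b join" jumps back to re-test the joined text, so the loop stops as soon as the first > has been read.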

Thanks
 
Old 09-24-2010, 08:09 AM   #6
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Would you be so kind as to do what I asked in posts number 2 and number 4? I'm not going to ask again.

If you don't post/attach/upload the original input file we cannot and will not help you.
 
Old 09-27-2010, 03:48 AM   #7
Felipe
Member
 
Registered: Oct 2006
Posts: 292

Original Poster
Rep: Reputation: 31
Here is the file:

I'm looking for filters to:
- Search for a managed-data-source by name (e.g. name="Apl2DS").
- Search for a connection-pool by name (e.g. name="Apl1PoolDS").
- List all managed-data-source entries.
- List all connection-pool entries.

To list all managed-data-source entries I use a filter like:


sed -e "/<managed-data-source/,/[^>]*>/p", but it doesn't work.


Any idea?
Thanks


<?xml version = '1.0' encoding = 'UTF-8'?>
<data-sources xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://xmlns.oracle.com/oracleas/schema/data-sources-10_1.xsd" schema-major-version="10" schema-minor-version="1">
<managed-data-source login-timeout="15" connection-pool-name="Apl1PoolDS" jndi-name="jdbc/Apl1PoolDS" name="Apl1DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl2PoolDS" jndi-name="jdbc/gis" name="Apl2DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl3PoolDS" jndi-name="jdbc/Apl3DS" name="Apl3DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl4PoolDS" jndi-name="jdbc/Apl4DS" name="Apl4DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl5PoolDS" jndi-name="jdbc/Apl5DS" name="Apl5DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl6PoolDS" jndi-name="jdbc/Apl6DS" name="Apl6DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl7PoolDS" jndi-name="jdbc/Apl7DS" name="Apl7DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl8PoolDS" jndi-name="jdbc/Apl8DS" name="Apl8DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Apl9PoolDS" jndi-name="jdbc/Apl9DS" name="Apl9DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Ap10PoolDS" jndi-name="jdbc/Ap10DS" name="Ap10DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Ap11PoolDS" jndi-name="jdbc/Ap11DS" name="Ap11DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Ap12PoolDS" jndi-name="jdbc/Ap12DS" name="Ap12DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Ap13PoolDS" jndi-name="jdbc/Ap13DS" name="Ap13DS"/>
<managed-data-source login-timeout="15" connection-pool-name="Ap14PoolDS" jndi-name="jdbc/Ap14DS" name="ArsiDS"/>
<connection-pool name="Apl1PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" max-connect-attempts="5" max-connections="25" min-connections="5" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.driver.OracleDriver" user="Apl1" password="clave1" url="jdbc:oracle:thin:@database1.com:1521:EIC1">
<property name="v$session.program" value="Apl1PoolDS"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl2PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" max-connect-attempts="5" max-connections="25" min-connections="5" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.driver.OracleDriver" user="Apl2" password="Apl2" url="jdbc:oracle:thin:@database2.com:1521:rcl">
<property name="v$session.program" value="Apl2PoolDS"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl3PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl3" password="Apl3" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl3PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl4PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="25" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl4" password="clave3" url="jdbc:oracle:thin:@//database3.com:1521/PRcl01">
<property name="connectionCacheName" value="Apl4PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl5PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl5" password="Apl5" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl5PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl6PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl6" password="Apl6" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl6PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl7PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl7" password="Apl7" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl7PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl8PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl8" password="Apl8" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl8PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Apl9PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Apl9" password="Apl9" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Apl9PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Ap10PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Ap10" password="Ap10" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Ap10PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Ap11PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Ap11" password="Ap1s" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Ap11PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Ap12PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="apl3" password="apl33" url="jdbc:oracle:thin:@//database4.com:1521/orcl3">
<property name="connectionCacheName" value="Ap12PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Ap13PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Ap13" password="Ap13" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Ap13PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
<connection-pool name="Ap14PoolDS" abandoned-connection-timeout="90" connection-retry-interval="30" inactivity-timeout="90" initial-limit="5" lower-threshold-limit="10" max-connect-attempts="5" max-connections="50" min-connections="5" property-check-interval="90" time-to-live-timeout="300" used-connection-wait-timeout="30" validate-connection="true" validate-connection-statement="select 1 from dual">
<connection-factory factory-class="oracle.jdbc.pool.OracleDataSource" user="Ap14" password="Ap14" url="jdbc:oracle:thin:@//database2.com:1521/orcl">
<property name="connectionCacheName" value="Ap14PoolDS"/>
<property name="connectionCachingEnabled" value="true"/>
<property name="fastConnectionFailoverEnabled" value="true"/>
</connection-factory>
</connection-pool>
</data-sources>
 
Old 09-27-2010, 04:07 AM   #8
kurumi
Member
 
Registered: Apr 2010
Posts: 223

Rep: Reputation: 45
You should really use an XML parser. Here's a Ruby example (other languages and their XML libraries work similarly).

Code:
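#!/usr/bin/env ruby
# Ruby 1.9.1
# (On Linux, "env ruby -w" in the shebang is passed as a single argument
# and fails; run "ruby -w script.rb" instead if you want warnings.)

require 'rexml/document'

# Read the whole file and parse it into a REXML document.
xml = REXML::Document.new(File.read("file"))

# Visit every <managed-data-source> child of the root element
# and print the one whose name attribute matches.
xml.elements.each("*/managed-data-source") do |element|
  puts element if element.attributes["name"] == "Apl2DS"
end
# ....
# ....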

Last edited by kurumi; 09-27-2010 at 04:08 AM.
 
Old 09-27-2010, 05:43 AM   #9
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Rep: Reputation: 2371
Hi,

Each managed-data-source entry is on a single line, so no multi-line sed is needed:
Quote:
- Look for all managed-data-sources
sed -n '/<managed-data-source/p' inputfile

Quote:
- Search for a managed-data-source by name (e.g. name="Apl2DS")
sed -n '/<managed-data-source/{/name="Apl2DS"/p}' inputfile

The following are a lot harder to do with sed, because not all entries have the same number of lines (do have a look at kurumi's suggestion!).
The answers below assume that every <connection-pool ... entry has 7 lines (not true for the first few in your example!!), so the first command is shown for Ap13PoolDS, which does span 7 lines. Note that the addr,+N address form is a GNU sed extension.
Quote:
- Search for a connection-pool by name (e.g. name="Apl1PoolDS")
sed -n '/<connection-pool name="Ap13PoolDS"/,+6p' inputfile

Quote:
- Look for all connection-pool
sed -n '/<connection-pool /,+6p' inputfile
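
If counting lines turns out to be too fragile, another option may be to let the closing tag end the range instead of a fixed line count. This is only a sketch: pool_by_name and all_pools are hypothetical names, it assumes every </connection-pool> sits on a line after its opening tag (as in the posted file), and the pool name is pasted into the regular expression unescaped, so it assumes names without regex metacharacters.

```shell
# Hypothetical helper: print one connection-pool block, chosen by name,
# from its opening tag to the next </connection-pool> line.
# $1 = pool name, $2 = input file.
pool_by_name() {
  sed -n "/<connection-pool name=\"$1\"/,/<\/connection-pool>/p" "$2"
}

# Hypothetical helper: print all connection-pool blocks, whatever their length.
# $1 = input file.
all_pools() {
  sed -n '/<connection-pool /,/<\/connection-pool>/p' "$1"
}
```

Called as pool_by_name Apl1PoolDS inputfile, this should print the 5-line entry as well as the 7-line ones, and it does not rely on the GNU-only +N address.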

Parsing XML (and HTML) files isn't easy because the layout can vary. Perl and Ruby do have XML parsers that could be of help.

Hope this helps.
 
Old 09-27-2010, 07:48 AM   #10
Felipe
Member
 
Registered: Oct 2006
Posts: 292

Original Poster
Rep: Reputation: 31
Thank you.

I suppose I'll have to use a parser: for now I only need to filter, but eventually I'll have to modify the data too.

The problem is that I'm writing shell scripts and don't know Perl or Ruby. I'll try to find an easy XML parser that I can drive from shell scripts.
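
One option that stays within shell scripting might be xmllint, the command-line tool shipped with libxml2, which can evaluate an XPath expression against a file. The sketch below assumes xmllint is installed and new enough to have the --xpath option; the function names ds_element and pool_of_ds are hypothetical, and the name is pasted into the XPath unescaped, so it assumes names without quote characters.

```shell
# Hypothetical helper: print the managed-data-source element with a given name.
# $1 = value of the name attribute, $2 = XML file.
ds_element() {
  xmllint --xpath "//managed-data-source[@name='$1']" "$2"
}

# Hypothetical helper: print only the connection-pool-name attribute value
# of that element. string() yields an empty string if nothing matches.
pool_of_ds() {
  xmllint --xpath "string(//managed-data-source[@name='$1']/@connection-pool-name)" "$2"
}
```

Because the parser works on elements and attributes, it does not matter whether an element is written on one line or on several.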

Regards
 
Old 09-27-2010, 07:58 AM   #11
GrapefruiTgirl
Guru
 
Registered: Dec 2006
Location: underground
Distribution: Slackware64
Posts: 7,594

Rep: Reputation: 543
At a glance, my opinion is that gawk (awk) would be better suited to a task like this, even though as you see, so far sed is doing the job. As mentioned/implied above, parsing markup languages can be tricky.
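
As a rough idea of what a plain awk version could look like (a sketch only, not XMLgawk): a flag is switched on at the opening tag of the wanted pool and off again after the closing tag, and lines are printed while the flag is set. The function name pool_block and the variable inside are hypothetical, and the sketch assumes the opening tag, with its name attribute, fits on one line, as in the posted file.

```shell
# Hypothetical helper: print the connection-pool block whose name attribute
# equals $1, reading the XML file given as $2.
pool_block() {
  awk -v name="$1" '
    # Opening tag of the wanted pool: start printing from this line.
    /<connection-pool / && index($0, "name=\"" name "\"") { inside = 1 }
    inside { print }
    # Closing tag: it has just been printed, so stop here.
    /<\/connection-pool>/ { inside = 0 }
  ' "$2"
}
```

Using index() compares the name as a literal string, so it does not need regex escaping; the trade-off is that any attribute on that line ending in name="..." with the same value would also match.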

Just in case you might be interested, I figured I'd point you to `xgawk` or `XMLgawk`, which is just what it sounds like: awk, for parsing XML; here's the homepage: http://home.vrweb.de/~juergen.kahrs/gawk/XML/

Note that I haven't played with it in some time, but it did work well when I last tried it, and it's only gotta be better now.

Good luck!
 
  

