LinuxQuestions.org
Old 05-17-2013, 02:56 PM   #1
ramsavi
LQ Newbie
 
Registered: May 2013
Posts: 16

Rep: Reputation: Disabled
how to grep with certain condition


Hi all,
I have an XML file. What I need to do is search for a source id (S1) and, if it matches one in the XML, extract the file mask from the file name. That is, if the file name is
Code:
idr_%YYYY%%MM%%DD%_%N%.idr
then I want the part from the first % to the last %, which in this case is
Code:
%YYYY%%MM%%DD%_%N%
The XML I am using looks like this:
Code:
<?xml version="1.0" encoding="UTF-8"?>

<Sources>

  <DatabaseConnection>

    <TNSName>dgid1rr</TNSName>

    <Username>rte</Username>

    <Password>rted1</Password>

  </DatabaseConnection>

  <Logger>

    <LogFilePath>/export/home/dgid1rr/merge/log</LogFilePath>

    <LogFileName>idr_merge_log</LogFileName>

    <LogLevel>Info</LogLevel>

  </Logger>

  <Source id="S1">
    <Type>Ericsson3GSGSN</Type>
    <Operator>0</Operator>
    <Version>1.0</Version> 
    <FileMask>idr_%YYYY%%MM%%DD%_%N%.idr</FileMask>

    <TimeLag>100</TimeLag>

    <Backup>True</Backup>

    <InputFilePath>/export/home/dgid1rr/cdr1/input</InputFilePath>

    <OutputFilePath>/export/home/dgid1rr/merge/cdr1/output</OutputFilePath>

    <BackupFilePath>/export/home/dgid1rr/merge/cdr1/backup</BackupFilePath>

    <ProcessOrder>FileTimestamp</ProcessOrder>

    <Enable>True</Enable>

    <MaxFiles>3</MaxFiles>

    <MaxRecs>25</MaxRecs>

  </Source>

  <Source id="S2">

    <FileMask>idr_%2N%.idr</FileMask>

    <TimeLag>300</TimeLag>

    <Backup>True</Backup>

    <InputFilePath>/export/home/dgid1rr/merge/cdr2/input</InputFilePath>

    <OutputFilePath>/export/home/dgid1rr/merge/cdr2/output</OutputFilePath>

    <BackupFilePath>/export/home/dgid1rr/merge/cdr2/backup</BackupFilePath>

    <ProcessOrder>OSTimestamp</ProcessOrder>

    <Enable>True</Enable>

    <MaxFiles>2</MaxFiles>

  </Source>

</Sources>
Thanks

Last edited by ramsavi; 05-17-2013 at 03:19 PM.
 
Old 05-17-2013, 03:27 PM   #2
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
You can't use grep on its own to extract substrings (except in minor cases with the -o option). A tool like sed or awk is better, but still not ideal.
Code:
sed -rn '/id="S1"/,/FileMask/ { /FileMask/s|.*>idr_([^.]+).*|\1|p }' infile.xml
This will work ok as long as the two tags are on separate lines.

The real problem though is that line-and-regex based tools are not really designed to work on xml or html, which have relatively free-form, nested structures. I highly recommend using something that has a real xml parser, like perl or xmlstarlet.

Code:
xmlstarlet sel -T -t -m '//Source[@id="S1"]/FileMask' -v 'substring(.,5,18)' -n infile.xml
You just need to understand something about xpath to use it. The substring function I used above assumes the date string is always uniform.
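
If the mask is not always 18 characters wide (the S2 mask in the sample XML, idr_%2N%.idr, is shorter), the fixed substring(.,5,18) will not fit. A minimal sketch of trimming from the first % to the last % with bash parameter expansion instead; the function name extract_mask is made up for illustration and is not part of any tool mentioned here:

```shell
#!/bin/bash
# Hypothetical helper: print the part of a file mask from the first '%'
# through the last '%', inclusive.
extract_mask() {
    local s=$1
    # Bail out unless there are at least two '%' characters to trim between.
    case $s in
        *%*%*) ;;
        *) return 1 ;;
    esac
    s=%${s#*%}      # drop everything before the first '%', then put the '%' back
    s=${s%\%*}%     # drop everything after the last '%', then put the '%' back
    printf '%s\n' "$s"
}

extract_mask 'idr_%YYYY%%MM%%DD%_%N%.idr'
extract_mask 'idr_%2N%.idr'
```

The value fed to the function would still come from a real parser, for example the xmlstarlet command above with -v '.' in place of the substring call.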

Another choice is to use xmlstarlet to convert the xml into the line-based pyx format, which is better suited for use with regex tools.

Code:
xmlstarlet pyx infile.xml | sed -rn "/^Aid S1/,/^-idr/ s/^-idr_([^.]+).idr/\1/p"
The html-xml-utils package has a program called hxpipe that does a similar conversion, although the output language is slightly different. I've found it to be more forgiving than xmlstarlet on poorly-formed input.
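
For reference, pyx puts each parse event on its own line: "(" opens an element, "A" introduces an attribute, "-" carries text and ")" closes an element. The sketch below feeds a hand-written pyx fragment through the same kind of sed range. The fragment was typed out for illustration rather than captured from xmlstarlet, so real output will also contain whitespace-only text lines, and pyx_sample is a made-up name:

```shell
#!/bin/bash
# Hand-written stand-in for what `xmlstarlet pyx infile.xml` emits for the
# two Source elements; not captured from a real run.
pyx_sample() {
    cat <<'EOF'
(Source
Aid S1
(FileMask
-idr_%YYYY%%MM%%DD%_%N%.idr
)FileMask
)Source
(Source
Aid S2
(FileMask
-idr_%2N%.idr
)FileMask
)Source
EOF
}

# From the "id S1" attribute line to the next text line starting with "idr",
# print what sits between "idr_" and ".idr".
pyx_sample | sed -E -n '/^Aid S1/,/^-idr/ s/^-idr_([^.]+)\.idr/\1/p'
```

Because every attribute and text node sits on its own line, the range no longer depends on how the original xml happened to be wrapped.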
 
Old 05-17-2013, 03:41 PM   #3
parnmatt
Member
 
Registered: Apr 2013
Location: Lancaster
Distribution: Mac OS X
Posts: 38

Rep: Reputation: 7
Code:
<INPUT STREAM> | grep '%[12][09][0-17-9][0-9]%%[01][0-9]%%[0-3][0-9]%_%[0-9]%'
Should do the trick to get the lines with that numbering syntax.
However, you should really use something like perl for this, or at least another language with better regex support; Ruby would work too.

EDIT:
I've just re-read what you were asking; this only extracts the lines with that syntax, and therefore is not a full solution.

I agree with David the H.'s suggestion.

Last edited by parnmatt; 05-17-2013 at 03:46 PM. Reason: Edit shown
 
Old 05-18-2013, 02:35 AM   #4
ramsavi
LQ Newbie
 
Registered: May 2013
Posts: 16

Original Poster
Rep: Reputation: Disabled
I am using awk like this. It works fine when the source id is 1 or any other hard-coded value, but when the source id comes from an array element it does not work. The code I am using is:

Code:
#!/bin/bash -xv



val_1=$( sqlplus -s rte/rted2@rel76d2 << EOF

set heading off

select max(istat_id) from cvt_istats;

exit

EOF

)

echo "val_1: $val_1"



nohup ./cvt -f MediationSources.xml &

sleep 60



declare -a arr

arr=($( sqlplus -s rte/rted2@rel76d2 << EOF

set heading off

select source_id from cvt_istats where istat_id > $val_1;

exit

EOF

))

echo "val_2: $arr"



i=0

len=${#arr[@]}

echo $len



while [ $len -gt $i ]

do

  value= cat MediationSources.xml |awk -F'[<>]' '/Source id="${arr[$i]}"/{f=1}f&&/<FileMask>/{sub(/[a-z_]*%/,"%",$3);sub(/%\..*/,"%",$3);print $3;exit}'

  echo "$value file mask pass "

  i=$(( $i + 1 ))

done
Can you please check where I am going wrong?
 
Old 05-19-2013, 09:32 AM   #5
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Arch + Xfce
Posts: 6,852

Rep: Reputation: 2037
Well, to start with, since your awk expression is enclosed in single quotes, the shell variables in it will not expand. You generally have to import shell variables into awk variables with the -v option if you want to use them.
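
A minimal sketch of that -v import, reusing the field-splitting idea from the script above. The function name mask_for and the awk variable names id, found and mask are made up for illustration, and it still relies on each tag being on its own line:

```shell
#!/bin/bash
# Hypothetical helper: print the trimmed FileMask of the Source whose id is $1.
# Usage: mask_for ID FILE
mask_for() {
    awk -v id="$1" -F'[<>]' '
        # Build the tag to look for from the imported awk variable.
        index($0, "<Source id=\"" id "\"") { found = 1 }
        found && /<FileMask>/ {
            mask = $3                  # text between <FileMask> and </FileMask>
            sub(/^[^%]*/, "", mask)    # drop everything before the first %
            sub(/[^%]*$/, "", mask)    # drop everything after the last %
            print mask
            exit
        }
    ' "$2"
}
```

It would be called as mask_for "${arr[$i]}" MediationSources.xml, so the shell expands the array element outside the single-quoted awk program.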

But as I said, you really shouldn't be using awk in the first place. Have you given any thought at all to the solutions I mentioned before?


Now to cover a few other points about your script:

1) What kind of output do the sqlplus commands produce? It would help in figuring out the script to know what kind of values it's using.

2) The way you set your array may not be the safest or most efficient method. If the input is newline delimited perhaps you could use mapfile. Or just a simple read if delimited in some other fashion.

3) Use a C-style for loop instead of a while loop, or just use a regular for loop and run it directly on the array.

3a) When using advanced shells like bash or ksh, it's recommended to use [[..]] for string/file tests, and ((..)) for numerical tests. Avoid using the old [..] test unless you specifically need POSIX-style portability.

http://mywiki.wooledge.org/BashFAQ/031
http://mywiki.wooledge.org/ArithmeticExpression

4) Your "value=" setting does not capture the output of your commands. Where are the necessary command substitution brackets?

5) Useless Use Of Cat.
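
Pulling points 2 to 5 together, the loop might look something like the sketch below. The printf line is only a stand-in for the second sqlplus query (its real output format is still unknown, see point 1), the names query_output, ids, file_mask and print_masks are arbitrary, and mapfile needs bash 4 or later:

```shell
#!/bin/bash
# Stand-in for the sqlplus output: assumed to be one source id per line.
query_output=$(printf '%s\n' S1 S2)

# 2) Read newline-delimited values into an array.
mapfile -t ids <<< "$query_output"

# Hypothetical wrapper around the awk call; reads the XML file named in $2.
file_mask() {
    awk -v id="$1" -F'[<>]' '
        index($0, "<Source id=\"" id "\"") { found = 1 }
        found && /<FileMask>/ {
            mask = $3
            sub(/^[^%]*/, "", mask)    # trim up to the first %
            sub(/[^%]*$/, "", mask)    # trim after the last %
            print mask
            exit
        }
    ' "$2"   # 5) awk opens the file itself, no cat needed
}

# 3) Loop over the array directly instead of counting with while.
print_masks() {
    local id value
    for id in "${ids[@]}"; do
        # 4) $(...) captures the command output in the variable.
        value=$(file_mask "$id" "$1")
        # 3a) [[ ]] for the string test.
        if [[ -n $value ]]; then
            echo "$id: $value"
        fi
    done
}
```

Running print_masks MediationSources.xml would then print one "id: mask" line for each id that has a matching Source element.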

Last edited by David the H.; 05-19-2013 at 09:35 AM.
 
Old 05-20-2013, 05:11 AM   #6
grail
LQ Guru
 
Registered: Sep 2009
Location: Perth
Distribution: Manjaro
Posts: 10,007

Rep: Reputation: 3192
Code:
awk '/id="S1"/{x=1}x && /FileMask/{print gensub(/^[^_]*_|\..*/,"","g");x=0}' infile.xml
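
Note that gensub() is a GNU awk extension, so the one-liner above needs gawk. A rough equivalent for other awks, using two sub() calls on the whole line; the function name mask_s1 is made up for illustration, and like the original it assumes the mask has a prefix ending in an underscore and a dot before the extension:

```shell
#!/bin/bash
# Hypothetical helper: same idea with POSIX sub() instead of gawk gensub().
# Usage: mask_s1 FILE
mask_s1() {
    awk '
        /id="S1"/ { x = 1 }
        x && /FileMask/ {
            sub(/^[^_]*_/, "")   # strip through the first underscore
            sub(/\..*/, "")      # strip from the first dot onward
            print
            x = 0
        }
    ' "$1"
}
```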
 
  


All times are GMT -5. The time now is 02:56 PM.
