LinuxQuestions.org
Old 04-03-2012, 03:21 AM   #1
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Rep: Reputation: 0
Extracting a string from log output.


Hi Guys,

I am looking for a quick hack to extract "Success" from the following log file.

% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
^M100 493 100 493 0 0 12832 0 --:--:-- --:--:-- --:--:-- 12832^M100 493 100 493 0 0 12611 0 --:--:-- --:--:-- --:--:-- 0
<?xml version="1.0" encoding="UTF-8"?>
<env:Envelope xmlns:env="http://www.w3.org/2003/05/soap-envelope" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:ns0="urn:myorganization:glide:services:FlowControlService:1:0"><env:Body><ns0:getActivationAgentPropertyValueResponseElement><ns0:status>Success</ns0:status><ns0:value>$1</ns0:value><ns0:message xsi:nil="1"/></ns0:getActivationAgentPropertyValueResponseElement></env:Body></env:Envelope>

Essentially, I want to capture the string between the tags in <ns0:status>Success</ns0:status>.

How should we go about it?
 
Old 04-03-2012, 03:38 AM   #2
business_kid
LQ Guru
 
Registered: Jan 2006
Location: Ireland
Distribution: Slackware & Android
Posts: 7,569

Rep: Reputation: 697
grep will get the line.

If you want to hack and trim, use egrep with a POSIX regex, or grep -P with a Perl regex, plus the appropriate switches. YMMV.
man regex - POSIX REs
man pcre - Perl REs (more powerful IMHO).
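
A rough sketch of the grep route described above. It assumes a grep that supports the -o switch (GNU grep does) and uses a hypothetical one-line sample file in place of the real log. The first stage keeps only the matched element; the second strips the tags.

```shell
# Hypothetical one-line stand-in for the real log file.
log=$(mktemp)
printf '%s\n' '<env:Body><ns0:status>Success</ns0:status><ns0:value>$1</ns0:value></env:Body>' > "$log"

# -o prints only the part of the line that matched, not the whole line.
element=$(grep -o '<ns0:status>[^<]*</ns0:status>' "$log")

# Strip the surrounding tags, leaving only the text between them.
status=$(printf '%s\n' "$element" | sed 's/<[^>]*>//g')
echo "$status"

rm -f "$log"
```

With a GNU grep built with PCRE support, grep -oP '(?<=<ns0:status>)[^<]*' should do the same in one step, but -P is not available everywhere.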
 
Old 04-03-2012, 06:50 AM   #3
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,120

Rep: Reputation: 2270
How do you like this?
Code:
sed ' /status/ { s/^.*\(<ns0:status>.*<.ns0:status>\).*$/\1/; p }; d ' logfile
 
1 members found this post helpful.
Old 04-03-2012, 07:18 AM   #4
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@pan64

The output is very close.

<ns0:status>Success</ns0:status>

All we need is "Success" instead of the whole tag. Should that be easy?
 
Old 04-03-2012, 07:22 AM   #5
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,120

Rep: Reputation: 2270
Just move the parentheses:
Code:
sed ' /status/ { s/^.*<ns0:status>\(.*\)<.ns0:status>.*$/\1/; p }; d ' logfile
 
1 members found this post helpful.
Old 04-03-2012, 07:25 AM   #6
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@pan64 Again, very close. How can we get a line feed (\n) after "Success"?

[user01@tmelbld19 ~]$ sed ' /status/ { s/^.*<ns0:status>\(.*\)<.ns0:status>.*$/\1/; p }; d ' logfile
Success[user01@tmelbld19 ~]$

So after "Success" can we have a line break?
 
Old 04-03-2012, 07:31 AM   #7
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,120

Rep: Reputation: 2270
Just put a \n after the \1 part.
Code:
sed ' /status/ { s/^.*<ns0:status>\(.*\)<.ns0:status>.*$/\1\n/; p }; d ' logfile
 
1 members found this post helpful.
Old 04-03-2012, 07:41 AM   #8
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@pan64

Dear Sir, Thank you for your help.

Yes, it works now, but I am interested not only in the solution but also in the learning that goes with it.

May I please ask about the magic behind your regex?

Many Thanks
 
Old 04-03-2012, 07:57 AM   #9
pan64
LQ Guru
 
Registered: Mar 2012
Location: Hungary
Distribution: debian i686 (solaris)
Posts: 8,120

Rep: Reputation: 2270
OK, let's try to explain. First, here is the man page for sed: http://linux.die.net/man/1/sed
The sed script is the part between the ' characters.
/status/ is an address: the commands that follow it apply only to lines containing the text "status".
Inside the {} are two commands to execute on the current line (which, at that point, contains the text "status").
The first command is s, for substitute. The syntax is s/search text/replacement text/. In the search text, ^ is the beginning of the line, .* means any run of characters, \( and \) delimit a group, . means any single character (here it stands in for the / of the closing tag), and $ is the end of the line. The replacement is \1, which means the first group matched, i.e. the text between \( and \). So the whole line is replaced with the grouped text, and a \n is added.
The second command is p, which prints the current line.
The last command is d, which deletes the line and moves on to the next one. It is executed for every line, because the /status/ address applies only to the commands inside the {}. Deleting also skips the automatic print sed would otherwise do at the end of each cycle, so only the output of p appears.
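
A small sketch of the walk-through above, run against a hypothetical three-line file rather than the real log. It assumes GNU sed; the file contents and variable names are made up for illustration.

```shell
# Hypothetical three-line stand-in for the real log file.
sample=$(mktemp)
printf '%s\n' \
  'progress noise' \
  '<env:Body><ns0:status>Success</ns0:status><ns0:value>$1</ns0:value></env:Body>' \
  'trailing noise' > "$sample"

# /status/ selects the line, s keeps only group \1, p prints the result,
# and d discards every line before sed's automatic print can happen.
extracted=$(sed '/status/ { s/^.*<ns0:status>\(.*\)<.ns0:status>.*$/\1/; p; }; d' "$sample")
echo "$extracted"

# Without the final d, each line is also auto-printed at the end of its cycle,
# so the noise lines appear and the matched line shows up twice.
lines_without_d=$(sed '/status/ { s/^.*<ns0:status>\(.*\)<.ns0:status>.*$/\1/; p; }' "$sample" | wc -l)
echo "$lines_without_d"

rm -f "$sample"
```

The first command should print only the extracted status, while the second count shows how many lines leak through once d is removed.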
 
2 members found this post helpful.
Old 04-03-2012, 10:10 AM   #10
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@pan64

Thank you very much !
 
Old 04-03-2012, 01:40 PM   #11
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
There's no need to make it that complicated. Just use the "-n" option to silence output by default, then add the "p" modifier directly to the substitute command. That way only lines that match will be printed.

You can also use the "-r" option to explicitly enable extended regex, so that there's no need to escape the parentheses.

Since "." matches any character, we really don't want to rely on it when we actually mean a literal "/". Unfortunately, "/" is the default delimiter for the "s" command. However, sed lets you use almost any other character as the delimiter (anything except a backslash or a newline), so just choose one that doesn't appear in the expression itself. I prefer "|" myself.

Next, the regex only really needs to contain enough of the string to ensure a unique match, and the starting and ending anchors are also superfluous here, because the leading and trailing ".*" already stretch the match to both ends of the line. Of course, it doesn't hurt to leave them in either.

Finally, it's probably safer to replace the ".*" (a string of characters of any length) with "[^<]*" (a string of "not <" of any length), to avoid any possible issues with regex greediness.

Code:
sed -rn '/status/ s|.*status>([^<]*)</ns0:stat.*|\1|p' logfile
BTW, I don't see where the line-ending issue could be coming from. sed always appends a newline to each line of output anyway.
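
To illustrate the greediness point made above, here is a minimal sketch using a made-up line that contains two status elements (the real log has only one). It uses -E, which GNU sed accepts as an equivalent of the -r used in the thread. The only difference between the two commands is ".*" versus "[^<]*" inside the group.

```shell
# Hypothetical line with two status elements, to provoke the greedy match.
line='<ns0:status>Success</ns0:status><ns0:status>Failure</ns0:status>'

# ".*" in the group runs on to the LAST closing tag on the line.
greedy=$(printf '%s\n' "$line" | sed -En 's|<ns0:status>(.*)</ns0:status>.*|\1|p')

# "[^<]*" cannot cross a "<", so it stops at the first closing tag.
bounded=$(printf '%s\n' "$line" | sed -En 's|<ns0:status>([^<]*)</ns0:status>.*|\1|p')

echo "greedy:  $greedy"
echo "bounded: $bounded"
```

The greedy version swallows everything up to the final closing tag, while the bounded version returns just the first value.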


Here are a few useful sed references.
http://www.grymoire.com/Unix/Sed.html
http://sed.sourceforge.net/grabbag/
http://sed.sourceforge.net/sedfaq.html
http://sed.sourceforge.net/sed1line.txt
 
2 members found this post helpful.
Old 04-03-2012, 06:30 PM   #12
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@David the H

Code:
[user01@tmelbld19 ~]$ ./getEvaluateReadyToRate.sh 2>&1|tee -a panduta.log|sed -rn '/status/ s|.*status>([^<]*)</ns0:stat.*|\1|p'
Success[user01@tmelbld19 ~]$
There is no line break after "Success"... regex and sed seriously confuse me a lot.
 
Old 04-03-2012, 08:33 PM   #13
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 23,066
Blog Entries: 11

Rep: Reputation: 910
Quote:
Originally Posted by sysmicuser View Post
@David the H

Code:
[user01@tmelbld19 ~]$ ./getEvaluateReadyToRate.sh 2>&1|tee -a panduta.log|sed -rn '/status/ s|.*status>([^<]*)</ns0:stat.*|\1|p'
Success[user01@tmelbld19 ~]$
There is no line break after "Success"... regex and sed seriously confuse me a lot.
As in the earlier post: just slap a \n behind \1
 
1 members found this post helpful.
Old 04-03-2012, 11:18 PM   #14
sysmicuser
Member
 
Registered: Mar 2010
Posts: 332

Original Poster
Rep: Reputation: 0
@Tinkster it works mate !
 
Old 04-05-2012, 11:03 AM   #15
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1957
Yes, the line break is an easy fix. But what I didn't understand in my last post was why sed wasn't including one to start with. sed works by placing the line into the pattern space minus the newline that delimited it, performing its edits there, then adding a newline back on output. So I was thinking it always added a newline to the output.

Besides, I got a newline in all of my test runs.

I've figured it out now though.

The reason you didn't get one is that the line operated on is the last one in the input, and there's no final newline after it (pretty much the only place that could happen). So I guess sed only adds a newline to the output if there was one in the input. Something new learned!
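
That behaviour can be checked with a quick sketch. It assumes GNU sed (other implementations may always append the newline) and uses made-up one-line inputs; the byte counts from wc -c reveal whether a trailing newline was written.

```shell
# The pattern from the thread, in basic-regex form so no -r/-E flag is needed.
script='s|.*status>\([^<]*\)</ns0:stat.*|\1|p'

# Input whose last line ends with a newline: output is "Success" plus a newline.
with_newline=$(printf '<ns0:status>Success</ns0:status>\n' | sed -n "$script" | wc -c)

# Input whose last line has no newline, like curl output piped straight in.
without_newline=$(printf '<ns0:status>Success</ns0:status>' | sed -n "$script" | wc -c)

echo "bytes with trailing newline in input:    $with_newline"
echo "bytes without trailing newline in input: $without_newline"
```

Appending \n in the replacement, as suggested earlier in the thread, fixes the no-newline case but would add a blank line whenever the input does end with a newline.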
 
1 members found this post helpful.
  

