LinuxQuestions.org
Programming This forum is for all programming questions.
The question does not have to be directly related to Linux and any language is fair game.

Old 10-23-2007, 06:58 PM   #1
hansschmucker
LQ Newbie
 
Registered: Oct 2004
Posts: 27

Rep: Reputation: 15
Question Solved: BASH get marked RegEx result: "foo s bar" | /foo(.*)bar/ 1 -> " s "


Hi everybody,

I did my best to describe my need in the title, and hopefully somebody who already knows the answer can help.

Here is a lengthier description of my problem.

I have an input string which I get from a cURLed website:

Quote:
"One Two One Two One Two <div>Thread Title:Hello World</div> One Two One Two One Two"
Now, I want to get "Hello World" from this.

I can get it, including the "<div>Thread Title: ... </div>" part, using pcregrep:

Code:
echo "$content" | pcregrep -o -e "<div>Thread Title:.*?<\/div>"
-> <div>Thread Title:Hello World</div>

But how can I get only Hello World? In JavaScript I'd do
Code:
(/<div>Thread Title:(.*?)<\/div>/).exec("One Two One Two One Two <div>Thread Title:Hello World</div> One Two One Two One Two")[1]
-> Hello World

But is there a tool that lets me do that on BASH? SED doesn't seem to be able to output anything but whole lines...

something like
Code:
echo $data|regexec "/<div>Thread Title:(.*?)<\/div>/" "1"
would be great!

Thank you in advance
Hans Schmucker
Mannheim
Germany

Last edited by hansschmucker; 10-23-2007 at 07:38 PM.
 
Old 10-23-2007, 07:04 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,965
Blog Entries: 11

Rep: Reputation: 865
Just use the match as a replacement string ...

Code:
echo $data|sed -r "/<div>Thread Title:(.*?)<\/div>/\1/g"
I'm not 100% certain whether sed knows the non-greedy *? quantifier; if it
doesn't, try

Code:
echo $data|sed -r "/<div>Thread Title:([^<]+)<\/div>/\1/g"

Cheers,
Tink
 
Old 10-23-2007, 07:10 PM   #3
hansschmucker
LQ Newbie
 
Registered: Oct 2004
Posts: 27

Original Poster
Rep: Reputation: 15
Hmm, that doesn't work. sed complains about an unknown character "\", probably because there's no command. Did you mean

Code:
echo $data|sed -r "s/<div>Thread Title:([^<]+)<\/div>/\1/g"
That works, but it still prints the full line...

Last edited by hansschmucker; 10-23-2007 at 07:12 PM.
 
Old 10-23-2007, 07:26 PM   #4
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,965
Blog Entries: 11

Rep: Reputation: 865
Errrh ... that was what I meant, and I'm having a blonde day :D
Code:
echo "$data"|sed -r "s/.*<div>Thread Title:([^<]+)<\/div>.*/\1/g"
Try that
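
One caveat with the whole-line substitution: sed still prints lines that don't match, unchanged, which matters when the input is a whole cURLed page rather than a single line. A minimal sketch of one way around that, assuming a sed that accepts -E (the portable spelling of -r); the function name extract_title_sed is made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the captured title, or nothing if no line matches.
extract_title_sed() {
  # -n turns off sed's automatic printing of every line;
  # the trailing p prints a line only when the substitution actually happened.
  printf '%s\n' "$1" | sed -n -E 's/.*<div>Thread Title:([^<]+)<\/div>.*/\1/p'
}

data='One Two One Two One Two <div>Thread Title:Hello World</div> One Two One Two One Two'
extract_title_sed "$data"            # prints: Hello World
extract_title_sed 'no title here'    # prints nothing
```

The g flag is dropped here because the pattern already spans the whole line, so it can match at most once per line.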
 
Old 10-23-2007, 07:29 PM   #5
hansschmucker
LQ Newbie
 
Registered: Oct 2004
Posts: 27

Original Poster
Rep: Reputation: 15
Ah, you're matching against the whole line and then replacing it... clever, I didn't think of that.

Thank you very much, and special thanks for your patience.
 
Old 10-23-2007, 08:33 PM   #6
angrybanana
Member
 
Registered: Oct 2003
Distribution: Archlinux
Posts: 147

Rep: Reputation: 21
The expr command would also do this:
Code:
$ expr "$data" : ".*<div>Thread Title:\(.*\)</div>.*"
Hello World
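
Note that expr only has greedy matching, so \(.*\) runs on to the last </div> on the line if there is more than one. A small sketch of the difference, using a made-up input line (data2 is not from the original page) and the [^<] class already suggested earlier in the thread:

```shell
#!/usr/bin/env bash
# Hypothetical input with a second </div> later on the same line.
data2='pre <div>Thread Title:Hello World</div> <div>other</div> post'

# Greedy \(.*\) extends to the LAST </div> on the line.
greedy=$(expr "$data2" : '.*<div>Thread Title:\(.*\)</div>.*')

# [^<]* cannot cross a "<", so the capture stops at the first closing tag.
bounded=$(expr "$data2" : '.*<div>Thread Title:\([^<]*\)</div>.*')

printf 'greedy:  %s\n' "$greedy"     # greedy:  Hello World</div> <div>other
printf 'bounded: %s\n' "$bounded"    # bounded: Hello World
```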
My favorite way for something like that is Perl (if it's an option)

Code:
echo "$data"|perl -lne 'print $1 if /<div>Thread Title:(.*?)<\/div>/'
Hello World
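
If an external tool isn't wanted at all, Bash itself (3.0 and later) can return a marked group through its [[ =~ ]] operator and the BASH_REMATCH array. This is only a sketch: the function name extract_title is made up, and since Bash uses POSIX extended regexes there is no lazy .*?, so [^<]* stands in for it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints capture group 1, or returns 1 if there is no match.
extract_title() {
  # Keep the regex in a variable and expand it unquoted;
  # quoting the right-hand side of =~ would make it a literal string.
  local re='<div>Thread Title:([^<]*)</div>'
  if [[ $1 =~ $re ]]; then
    # BASH_REMATCH[0] is the whole match, BASH_REMATCH[1] the first group.
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    return 1
  fi
}

data='One Two One Two One Two <div>Thread Title:Hello World</div> One Two One Two One Two'
extract_title "$data"    # prints: Hello World
```

Unlike the sed and expr versions, the pattern doesn't need a leading and trailing .* here, because =~ matches anywhere in the string.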

Last edited by angrybanana; 10-23-2007 at 09:33 PM.
 
Old 10-23-2007, 10:34 PM   #7
hansschmucker
LQ Newbie
 
Registered: Oct 2004
Posts: 27

Original Poster
Rep: Reputation: 15
I found another interesting option, which is my favourite so far: an archived EXE build of SpiderMonkey (that's Mozilla's JavaScript engine). I'm under Windows right now, and although I'm running bash, mencoder and hundreds of other Linux applications, building applications is still a pain, so I have to resort to builds created by somebody else. It's only 500k and has virtually no dependencies.
http://209.85.135.104/search?q=cache...e%3DJavaScript

All I need to do is something like this:
js -e "print((/Hello(.*?)World/).exec('Hello You World')[1]);"

Not quite as fast as SED, but a lot more comfortable for me...
 
  



Tags
bash, helpful, regex, sed

