Linux - Newbie This Linux forum is for members that are new to Linux.
Just starting out and have a question? If it is not in the man pages or the how-to's this is the place!


10-05-2012, 05:30 AM   #1
Registered: Aug 2010
Location: Zhongli, Taoyuan
Distribution: slackware, windows, debian (armv4l GNU/Linux)
Posts: 425
Blog Entries: 28

Rep: Reputation: 2
sed regex match

I don't get why this matches the text of the upper example but not the lower one:
torrentlink=($(sed -rn '/target=/s/.*href="([^"]+)".*>.*\.torrent.*/\1/p' "$html"))
torrentname=($(sed -rn "/target=/s/.*>(.*)\.torrent.*/\1/p" "$html"))
Text to be matched:
<a href="forum.php?mod=attachment&amp;aid=NzkxNzl8ZGZhMjViZjB8MTM0OTQzMjkyOXw5MDM3MnwxODkxNTQ%3D" target="_blank">[DMG][Dog Days`][04][1280x720][BIG5].mp4.torrent</a>
<a href="forum.php?mod=attachment&amp;aid=ODIzNDV8Nzc2OTFhZmR8MTM0OTQzMjk2N3w5MDM3MnwxNzM5ODI%3D" target="_blank">[dmfans][Shirokuma_Cafe][27][848480][BIG5].rmvb.torrent</a>
The two lines are identical as far as the pattern is concerned, yet only the first one is matched and not the second. I wonder if this is some kind of bug.
10-05-2012, 09:19 AM   #2
Registered: Dec 2006
Distribution: RHEL Debian
Posts: 42

Rep: Reputation: 15
I might be wrong, but I think that by default sed only replaces the first match. Try adding a g flag at the end of the s command to make the substitution global.
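A small sketch of what the g flag actually changes, using made-up two-line input (the variable names here are illustrative, not from the thread). sed runs the s command on every input line either way; g only matters when a single line contains more than one match, so it would not explain a whole line going unmatched.

```bash
#!/usr/bin/env bash
# Hypothetical input: two lines, each containing "a" twice.
input=$'a-a\na-a'

# Without g: only the first match on each line is replaced,
# but every line is still processed.
without_g=$(printf '%s\n' "$input" | sed 's/a/X/')

# With g: every match on each line is replaced.
with_g=$(printf '%s\n' "$input" | sed 's/a/X/g')

printf 'without g:\n%s\n' "$without_g"   # X-a on both lines
printf 'with g:\n%s\n' "$with_g"         # X-X on both lines
```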
10-07-2012, 02:45 PM   #3
David the H.
Bash Guru
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,826

Rep: Reputation: 1973
You've marked this solved, but didn't explain why. Did you figure it out? It would probably also have helped if you had posted the output you got, as well as what you wanted to get.

As far as I can see, the regex isn't really the problem. The way you're setting the arrays is. The unquoted command substitution is split on whitespace (unless you reset IFS to avoid it). But that would affect the first line, whose name contains a space, and not the second.
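The splitting issue can be sketched like this. The sample names are shortened, hypothetical stand-ins for the ones in the question, and the array names are made up for the demo. Globbing is switched off with set -f because the square brackets in the names are also glob patterns, which is a second hazard of the unquoted form.

```bash
#!/usr/bin/env bash
# Hypothetical sample data: two names, the first one containing a space.
sample=$'[DMG][Dog Days`][04].mp4\n[dmfans][Shirokuma_Cafe][27].rmvb'

set -f   # disable globbing so that only word splitting is demonstrated

# Unquoted command substitution: split on spaces, tabs and newlines,
# so the name with a space becomes two array elements.
split_default=( $(printf '%s\n' "$sample") )

set +f   # restore globbing

# mapfile reads one array element per line, spaces included.
mapfile -t split_lines < <(printf '%s\n' "$sample")

printf 'array=( $(...) ): %d elements\n' "${#split_default[@]}"   # 3
printf 'mapfile -t:       %d elements\n' "${#split_lines[@]}"     # 2
printf '(%s)\n' "${split_lines[@]}"
```

Setting IFS=$'\n' before the array assignment would also keep each line together, but mapfile avoids changing shell-wide settings.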

I suggest using mapfile instead in any case (assuming bash). It's much safer and cleaner.

We can probably also simplify the sed expressions a bit.

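# On each line containing target=, keep only the value of href="..."
mapfile -t torrentlink < <( sed -rn '/target=/ { s/.*href="// ; s/".*//p }' <"$html" )
# On each line containing target=, keep only the link text, minus the trailing .torrent
mapfile -t torrentname < <( sed -rn '/target=/ { s/[^>]+>// ; s/[.]torrent<.*//p }' <"$html" )

# Test: print each array element in parentheses
printf '(%s)\n' "${torrentlink[@]}"
printf '(%s)\n' "${torrentname[@]}"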
This is the output I get for the above:

([DMG][Dog Days`][04][1280x720][BIG5].mp4)
All times are GMT -5.
