LinuxQuestions.org
10-05-2012, 06:30 AM   #1
ted_chou12
Member
 
Registered: Aug 2010
Location: Zhongli, Taoyuan
Distribution: slackware, windows, debian (armv4l GNU/Linux)
Posts: 421
Blog Entries: 28

Rep: Reputation: 2
sed regex match


I don't understand why this matches the upper example text but not the lower one:
Code:
torrentlink=($(sed -rn '/target=/s/.*href="([^"]+)".*>.*\.torrent.*/\1/p' "$html"))
torrentname=($(sed -rn "/target=/s/.*>(.*)\.torrent.*/\1/p" "$html"))
Text to be matched:
Code:
<a href="forum.php?mod=attachment&amp;aid=NzkxNzl8ZGZhMjViZjB8MTM0OTQzMjkyOXw5MDM3MnwxODkxNTQ%3D" target="_blank">[DMG][Dog Days`][04][1280x720][BIG5].mp4.torrent</a>
<a href="forum.php?mod=attachment&amp;aid=ODIzNDV8Nzc2OTFhZmR8MTM0OTQzMjk2N3w5MDM3MnwxNzM5ODI%3D" target="_blank">[dmfans][Shirokuma_Cafe][27][848480][BIG5].rmvb.torrent</a>
The two lines are identical as far as the pattern is concerned, yet it matches only the first line and not the second. I wonder if this is some kind of bug.
Thanks,
Ted
 
10-05-2012, 10:19 AM   #2
henrycoffin
Member
 
Registered: Dec 2006
Distribution: RHEL Debian
Posts: 42

Rep: Reputation: 15
I might be wrong, but I think that by default sed's s command only replaces the first match. Try adding a g flag at the end of the substitution to make it global.
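
For reference, a minimal sketch of what the g flag changes. The two-line sample text is made up for illustration. sed runs the s command on every input line either way; g only controls whether one match or all matches are replaced within each line.

```bash
# Hypothetical two-line input; each line contains "a" twice.
input='a-a
a-a'

# Without g: only the first "a" on EACH line is replaced.
first_only=$(printf '%s\n' "$input" | sed 's/a/X/')

# With g: every "a" on each line is replaced.
all_matches=$(printf '%s\n' "$input" | sed 's/a/X/g')

printf '%s\n' "$first_only" "$all_matches"
```

Both lines are changed in both runs, so a missing g would not by itself explain a whole line being skipped.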
 
10-07-2012, 03:45 PM   #3
David the H.
Bash Guru
 
Registered: Jun 2004
Location: Osaka, Japan
Distribution: Debian sid + kde 3.5 & 4.4
Posts: 6,823

Rep: Reputation: 1950
You've marked this solved, but didn't explain why. Did you figure it out? It would probably have helped if you had posted the output you got, as well as what you wanted to get.

As far as I can see, the regex isn't really the problem, but the way you're setting the arrays is. The unquoted command substitution is split on whitespace (unless you reset IFS to avoid it), so a name containing a space, like the "Dog Days`" one, ends up as two array elements. But that would affect the first line and not the second.
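
To illustrate the splitting, here is a minimal sketch. It assumes bash and a sed that accepts -E (equivalent to GNU sed's -r); the href values are shortened to the made-up x1 and x2, and the temporary file stands in for "$html". The set -f call is only there because unquoted expansions are also subject to filename expansion, and the bracketed names would otherwise be treated as glob patterns.

```bash
#!/usr/bin/env bash
# Sample input: the two anchor lines from the first post, hrefs shortened (hypothetical).
html=$(mktemp)
cat >"$html" <<'EOF'
<a href="x1" target="_blank">[DMG][Dog Days`][04][1280x720][BIG5].mp4.torrent</a>
<a href="x2" target="_blank">[dmfans][Shirokuma_Cafe][27][848480][BIG5].rmvb.torrent</a>
EOF

# The original expression prints one name per matching line.
names=$(sed -En '/target=/s/.*>(.*)\.torrent.*/\1/p' "$html")
line_count=$(printf '%s\n' "$names" | grep -c '')

set -f               # keep the [...] parts literal for this demonstration
split=( $names )     # unquoted: split on spaces as well as on newlines
set +f

echo "lines from sed: $line_count, array elements: ${#split[@]}"
rm -f "$html"
```

sed prints both names, but the array gets one element more than there are lines, because the first name is cut at its space.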

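I suggest using mapfile instead in any case (assuming bash 4 or later). It stores each input line as one array element, with no word splitting or filename expansion, so it's much safer and cleaner.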

We can probably also simplify the sed expressions a bit.

Code:
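# Link: drop everything up to and including href=", then everything from the next " on.
mapfile -t torrentlink < <( sed -rn '/target=/ { s/.*href="// ; s/".*//p }' <"$html" )
# Name: drop the opening tag, then .torrent and everything after it.
mapfile -t torrentname < <( sed -rn '/target=/ { s/[^>]+>// ; s/[.]torrent<.*//p }' <"$html" )

# Test: print the lines.
printf '(%s)\n' "${torrentlink[@]}"
echo
printf '(%s)\n' "${torrentname[@]}"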
This is the output I get for the above:
Code:
(forum.php?mod=attachment&amp;aid=NzkxNzl8ZGZhMjViZjB8MTM0OTQzMjkyOXw5MDM3MnwxODkxNTQ%3D)
(forum.php?mod=attachment&amp;aid=ODIzNDV8Nzc2OTFhZmR8MTM0OTQzMjk2N3w5MDM3MnwxNzM5ODI%3D)

([DMG][Dog Days`][04][1280x720][BIG5].mp4)
([dmfans][Shirokuma_Cafe][27][848480][BIG5].rmvb)
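
A self-contained variant of the same approach, as a sketch only: it writes two sample lines (with made-up, shortened href values l1 and l2 and shortened names) to a temporary file, fills both arrays with mapfile, and then walks them by index so that each link stays paired with its name. It assumes bash 4 or later and a sed that accepts -E (equivalent to GNU sed's -r).

```bash
#!/usr/bin/env bash
# Hypothetical sample file standing in for "$html".
html=$(mktemp)
cat >"$html" <<'EOF'
<a href="l1" target="_blank">[DMG][Dog Days`][04].mp4.torrent</a>
<a href="l2" target="_blank">[dmfans][Shirokuma_Cafe][27].rmvb.torrent</a>
EOF

# One array element per output line; -t strips the trailing newline from each.
mapfile -t torrentlink < <(sed -n '/target=/{s/.*href="//;s/".*//p;}' "$html")
mapfile -t torrentname < <(sed -En '/target=/{s/[^>]+>//;s/[.]torrent<.*//p;}' "$html")

# Walk both arrays by index so each link stays paired with its name.
for i in "${!torrentlink[@]}"; do
    printf '%s -> %s\n' "${torrentlink[i]}" "${torrentname[i]}"
done
rm -f "$html"
```

The name containing a space stays in a single element here, which is the point of reading line by line instead of relying on word splitting.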
 
All times are GMT -5.