LinuxQuestions.org
10-10-2007, 05:31 PM   #1
xmrkite
Member
 
Registered: Oct 2006
Location: California, USA
Distribution: Mint 16, Lubuntu 14.04, Mythbuntu 14.04, Kubuntu 13.10, Xubuntu 10.04
Posts: 542

Rep: Reputation: 30
Use sed to find and replace a url


Hello. Sed is tough to learn.

I need to take several files, each with a bunch of URLs in them, and strip out parts of each URL.

The files contain lines like these:
Code:
<a href='http://www.yahoo.com/here-is-testpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-goodpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-badpage-this-is-the-page.aspx'>
I need to end up with just
Code:
testpage
goodpage
badpage
So I need to get rid of the
Code:
<a href='http://www.yahoo.com/
and the
Code:
here-is-
and then the
Code:
-this-is-the-page.aspx'>
Currently, I open the files in gedit and use find and replace: I search for "here-is-" and replace it with nothing, which deletes it.

There must be a way to do this with sed. I want to write a few scripts to do this automatically so that I don't have to do it by hand (there are a lot of files to process).
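
For what it's worth, the gedit routine described above maps almost one-to-one onto sed: each find-and-replace becomes an `s` command with an empty replacement. A minimal sketch, assuming the lines look exactly like the three quoted in this post; the function name `strip_url_parts` is made up for illustration:

```shell
# Hypothetical helper: reads lines on stdin and deletes the three fixed pieces.
# One -e per find-and-replace step; each replaces its match with nothing.
# '|' is the delimiter so the slashes in the URL need no escaping.
strip_url_parts() {
    sed -e "s|<a href='http://www\.yahoo\.com/||" \
        -e "s|here-is-||" \
        -e "s|-this-is-the-page\.aspx'>||"
}

strip_url_parts <<'EOF'
<a href='http://www.yahoo.com/here-is-testpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-goodpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-badpage-this-is-the-page.aspx'>
EOF
# prints:
# testpage
# goodpage
# badpage
```

This only prints the result; nothing is written back to a file.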
 
10-10-2007, 06:22 PM   #2
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,978
Blog Entries: 11

Rep: Reputation: 879
Something like this?

Code:
$ cat test.html                                             
<a href='http://www.yahoo.com/here-is-testpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-goodpage-this-is-the-page.aspx'>
<a href='http://www.yahoo.com/here-is-badpage-this-is-the-page.aspx'>
$ sed -r "s@^.+here-is-(.+)-this-is-the-page\.aspx'>@\1@" test.html
testpage
goodpage
badpage
$
If this does what you want, just do
Code:
find . -type f -name '*.html' -exec sed -r -i "s@^.+here-is-(.+)-this-is-the-page\.aspx'>@\1@" {} \;
and it will "fix" all *.html files in the current directory and
all subdirs, editing them in place.
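
Since `-i` overwrites the files, it may be worth trying the command on a scratch copy first. A rough sketch, assuming GNU sed and find; the directory layout and file names are invented for the demo, and `-i.bak` keeps a backup of each edited file:

```shell
# Scratch directory so no real files are touched (hypothetical layout).
work=$(mktemp -d)
mkdir -p "$work/sub"
printf '%s\n' "<a href='http://www.yahoo.com/here-is-testpage-this-is-the-page.aspx'>" > "$work/a.html"
printf '%s\n' "<a href='http://www.yahoo.com/here-is-goodpage-this-is-the-page.aspx'>" > "$work/sub/b.html"

# The substitution from this post (the dot before "aspx" is escaped so it
# matches only a literal dot). -E is equivalent to -r in GNU sed.
# -i.bak edits in place and saves each original as <name>.bak.
find "$work" -type f -name '*.html' \
    -exec sed -E -i.bak "s@^.+here-is-(.+)-this-is-the-page\.aspx'>@\1@" {} +

cat "$work/a.html" "$work/sub/b.html"
# prints:
# testpage
# goodpage
```

If the result looks wrong, the untouched originals are still there as `a.html.bak` and `sub/b.html.bak`.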


Cheers,
Tink

Last edited by Tinkster; 10-10-2007 at 06:24 PM.
 
10-10-2007, 06:56 PM   #3
xmrkite
Member
 
Registered: Oct 2006
Location: California, USA
Distribution: Mint 16, Lubuntu 14.04, Mythbuntu 14.04, Kubuntu 13.10, Xubuntu 10.04
Posts: 542

Original Poster
Rep: Reputation: 30
You are the man! That worked exactly how I needed it. If only I understood why.
-Thanks
 
10-10-2007, 07:14 PM   #4
syg00
LQ Veteran
 
Registered: Aug 2003
Location: Australia
Distribution: Lots ...
Posts: 12,279

Rep: Reputation: 1028
It probably ain't "sed" that's the tough learn, it's regex.
Plenty of threads here recommending tutorials - but it's still a tough slog when you start.
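
For anyone stuck on the regex side, it can help to build the pattern from post #2 in two steps and look at what each one leaves behind. A small sketch, assuming one of the sample lines from the first post; the variable names are only for illustration:

```shell
# One sample line from the first post.
line="<a href='http://www.yahoo.com/here-is-testpage-this-is-the-page.aspx'>"

# Step 1: ^.+here-is-
#   ^         anchors the match at the start of the line
#   .+        one or more of any character (everything up to ...)
#   here-is-  ... this literal text
# Replacing that with nothing removes the front of the line.
step1=$(printf '%s\n' "$line" | sed -E "s@^.+here-is-@@")

# Step 2: add (.+)-this-is-the-page\.aspx'>
#   (.+)      captures the part in between as group 1
#   the rest  is literal text matching the tail of the line
# \1 in the replacement puts back only the captured group.
step2=$(printf '%s\n' "$line" | sed -E "s@^.+here-is-(.+)-this-is-the-page\.aspx'>@\1@")

printf '%s\n' "$step1" "$step2"
# prints:
# testpage-this-is-the-page.aspx'>
# testpage
```

The `@` after `s` is just the delimiter; any character works there, and picking one that does not occur in the URL avoids having to escape the slashes.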
 
10-10-2007, 07:20 PM   #5
Tinkster
Moderator
 
Registered: Apr 2002
Location: in a fallen world
Distribution: slackware by choice, others too :} ... android.
Posts: 22,978
Blog Entries: 11

Rep: Reputation: 879
Quote:
Originally Posted by xmrkite
You are the man! That worked exactly how I needed it. If only I understood why.
-Thanks
Thanks for the praise ;}

Which bit is giving trouble? Happy to elaborate :}



Cheers,
Tink
 
  


All times are GMT -5.