LinuxQuestions.org
Old 01-22-2012, 11:58 AM   #1
towheedm
Member
 
Registered: Sep 2011
Location: Trinidad & Tobago
Distribution: Debian Stretch
Posts: 612

Please help me with this sed command


I'm trying to use sed to extract the URL portion of a line from a file.

The file contains the following line:
Code:
DOWNLOAD="http://ftp.a.b/c/d/file-to-download"
Using this command:
Code:
sed -n 's/^DOWNLOAD="//p' filename
returns:
Code:
http://ftp.a.b/c/d/file-to-download"
I can remove the last " character by piping the first command to another sed command:
Code:
sed -n 's/^DOWNLOAD="//p' filename | sed -n 's/"//p'
but there must be a better way.

For the second part, I have no idea where to even start: I would also like to return just the "file-to-download" part of the URL.

Any help is greatly appreciated.
 
Old 01-22-2012, 12:04 PM   #2
sycamorex
LQ Veteran
 
Registered: Nov 2005
Location: London
Distribution: Slackware64-current
Posts: 5,836
Blog Entries: 1

try:
Code:
sed 's/.*\(http.*\)\("\)/\1/' infile
 
Old 01-22-2012, 12:12 PM   #3
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Hi,

The URL part (alternative for sycamorex code):
Code:
sed 's/DOWNLOAD="\(.*\)"/\1/' infile
The file part:
Code:
sed 's%.*/\(.*\)"%\1%' infile
Hope this helps.
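Both commands can be tried against a throwaway file. This is only a sketch: it assumes the file holds nothing but the single DOWNLOAD line from the first post, and the temp file and the variable names url and file are made up for illustration.

```shell
# Create a sample file holding only the line from the original question.
infile=$(mktemp)
printf '%s\n' 'DOWNLOAD="http://ftp.a.b/c/d/file-to-download"' > "$infile"

# URL part: strip the DOWNLOAD=" prefix and the closing quote.
url=$(sed 's/DOWNLOAD="\(.*\)"/\1/' "$infile")

# File part: drop everything up to the last slash, then the closing quote.
file=$(sed 's%.*/\(.*\)"%\1%' "$infile")

echo "url:  $url"
echo "file: $file"
rm -f "$infile"
```

Without -n and the p flag, sed also prints every line that does not match, unchanged, so on a file with more lines the output would contain more than the two values.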
 
Old 01-22-2012, 01:10 PM   #4
towheedm
Member
 
Registered: Sep 2011
Location: Trinidad & Tobago
Distribution: Debian Stretch
Posts: 612

Original Poster
druuna,

That works. I understand the first part, which returns the URL. To print out only that line I did:
Code:
sed -n 's/DOWNLOAD="\(.*\)"/\1/p' infile
The part that returns the file I don't get. I should have mentioned that 'infile' has several URLs, so the second command returns the file part of each URL. How can I get it to return (and print) only the file part of the line starting with DOWNLOAD="? I'm having a hard time understanding the second command. Is it too much to ask for a little explanation?

Thank you very much.
 
Old 01-22-2012, 01:39 PM   #5
druuna
LQ Veteran
 
Registered: Sep 2003
Posts: 10,532
Blog Entries: 7

Hi,
Both my samples assume that DOWNLOAD="http://ftp.a.b/c/d/file-to-download" is all that's on that line and nothing else. If it isn't then both samples will probably work incorrectly.

Please provide us with a relevant sample, otherwise the answers given might not work.

About my second solution:
Code:
sed 's%.*/\(.*\)"%\1%' infile
- The brown part (.*/) matches all up to and including the last forward slash,
- The green part (\(.*\)) matches the rest (minus the last ") and is used in the replace part as \1,
- I used % instead of / as separator because I need to match a forward slash.

The pattern will fail if there's more on a line: .* in sed is greedy and will grab all up to and including the last / it finds.

Hope this helps.
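To see the greedy behaviour in action, here is a small sketch. The sample line with trailing text is invented for the demonstration (it is not from the original file), and the bounded pattern using [^"]* is just one possible way to keep the match inside the first quoted string.

```shell
# Hypothetical line with extra text after the URL (not from the thread).
line='DOWNLOAD="http://ftp.a.b/c/d/file" # mirror "http://x.y/z"'

# Greedy .* runs to the LAST slash on the line, so the wrong name comes back.
greedy=$(printf '%s\n' "$line" | sed 's%.*/\(.*\)"%\1%')

# Using [^"]* instead of .* keeps the match inside the first quoted string;
# the trailing .* then swallows whatever follows the closing quote.
bounded=$(printf '%s\n' "$line" | sed 's%^DOWNLOAD="[^"]*/\([^"]*\)".*%\1%')

echo "greedy:  $greedy"
echo "bounded: $bounded"
```

Here the greedy version returns z (the tail of the second URL), while the bounded version returns file.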

 
Old 01-22-2012, 02:06 PM   #6
H_TeXMeX_H
LQ Guru
 
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
Blog Entries: 2

It would help if you posted more of the file. I suspect it is an HTML file, and as such it should have tags, which would make it much easier to deal with.
 
Old 01-22-2012, 02:46 PM   #7
towheedm
Member
 
Registered: Sep 2011
Location: Trinidad & Tobago
Distribution: Debian Stretch
Posts: 612

Original Poster
Ah, I was wondering about the %. I usually use a comma so did not think about it as the separator. Thanks for the explanation. Now:
Code:
sed -n 's%DOWNLOAD=.*/\(.*\)"%\1%p' infile
returns the file part from the line starting with DOWNLOAD=.

BTW: Colors don't show in code, or do I need to change some setting? Forget that... gotta get my eyes checked or get a new monitor. :-)

The file is actually the .info file from a SlackBuild archive. The format is:
Code:
PRGNAM="program-name"
VERSION="program-version"
HOMEPAGE="http://home/page/to/program/archive"
DOWNLOAD="http://link/to/program/archive/program-name"
MD5SUM="MD5 checksum value for program-name"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
MAINTAINER="Name of maintainer"
EMAIL="email address of maintainer"
APPROVED="Name of person who approved the SlackBuild script"
Thanks for all your help.
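For anyone wanting to check this against a multi-line file, here is a rough sketch using a cut-down copy of the format above. The temp file and its placeholder values are only for illustration, and the ^ anchor is an addition that is not in the commands posted so far.

```shell
# Sample .info file; the values are placeholders in the format shown above.
info=$(mktemp)
cat > "$info" <<'EOF'
HOMEPAGE="http://home/page/to/program/archive"
DOWNLOAD="http://link/to/program/archive/program-name"
MD5SUM="MD5 checksum value for program-name"
DOWNLOAD_x86_64=""
MD5SUM_x86_64=""
EOF

# -n plus the p flag prints only lines where the substitution succeeded.
# The ^ anchor and the literal =" keep HOMEPAGE and DOWNLOAD_x86_64 out.
url=$(sed -n 's%^DOWNLOAD="\(.*\)"%\1%p' "$info")
file=$(sed -n 's%^DOWNLOAD=".*/\(.*\)"%\1%p' "$info")

echo "url:  $url"
echo "file: $file"
rm -f "$info"
```

The HOMEPAGE line also contains slashes and a closing quote, which is why the unanchored file command from post #3 prints something for it as well; requiring DOWNLOAD=" at the start of the line is what narrows the output to one line.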

 
Old 01-22-2012, 06:55 PM   #8
Cedrik
Senior Member
 
Registered: Jul 2004
Distribution: Slackware
Posts: 2,140

You can also use basename to extract the file name from the url

Code:
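# Extract the URL from the DOWNLOAD="..." line
URL=$(sed -n 's/DOWNLOAD="\(.*\)"/\1/p' filename)
# Quote the variable so the shell does not word-split or glob the URL
FILE=$(basename "$URL")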

echo "url: $URL"
echo "file: $FILE"
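If spawning basename is not wanted, POSIX shell parameter expansion can do the same split. A minimal sketch, assuming URL already holds the value extracted by sed; the value below and the DIR variable are just examples.

```shell
# Example value; in practice URL would come from the sed command above.
URL="http://link/to/program/archive/program-name"

# ${URL##*/} removes the longest prefix matching */, leaving the last path component.
FILE=${URL##*/}

# ${URL%/*} removes the shortest suffix matching /*, leaving the directory part.
DIR=${URL%/*}

echo "file: $FILE"
echo "dir:  $DIR"
```

One difference to be aware of: for a URL that ends in a slash, ${URL##*/} gives an empty string, whereas basename returns the last non-empty component.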
