Old 04-12-2009, 12:34 PM   #1
frenchn00b
Senior Member
 
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561

Rep: Reputation: 57
how to cat only the http:// links in my webpage?


Hello,

I am trying to cat (list) the *.gif links of a webpage.

So here is how I do it:

Code:
wget -k "http://passions.mettavant.fr/pdabureautique.htm"
Code:
sh script.sh pdabureautique.htm
with script.sh
Code:
#!/bin/sh
# print every absolute http:// URL found in the file given as $1
cat "$1" | grep -o 'http:[^"]*'
but the problem is that the cat of the htm page shows src attributes without the full link/URL ... without http ...
Code:
...

  <tr>
      <td bgcolor="#F3F2FD" valign="top">
        <div align="center"><font face="Tahoma" size="1"><a href="#note"><img src="Images/pda/icones/carnetouvert.gif" width="34" height="34" border="0" style="filter:alpha(opacity=40)" onMouseover="makevisible(this,0)" onMouseout="makevisible(this,1)"></a><br>
          bloc notes<br>
          note taker - pad</font></div>

...
How can I make it work?

I am blocked. unspawn, the king, any ideas?

thank you!
 
Old 04-12-2009, 02:43 PM   #2
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
what does "cat the *.gif" mean???

what you want to do is exactly what the -k option does, and after testing it myself with the page you linked, it works just fine. does your version of wget actually support -k?
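For reference, a minimal sketch of that combination (the saved filename is the URL's basename, the same file your script.sh invocation uses):
Code:
# -k / --convert-links rewrites links in the saved page; links to files
# that were NOT downloaded (the images here) become absolute http:// URLs
wget -k "http://passions.mettavant.fr/pdabureautique.htm"

# the original grep now matches the image links too
grep -o 'http:[^"]*' pdabureautique.htm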
 
Old 04-13-2009, 03:23 AM   #3
frenchn00b
Senior Member
 
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561

Original Poster
Rep: Reputation: 57
Quote:
Originally Posted by acid_kewpie View Post
what does "cat the *.gif" mean???

what you want to do is exactly what the -k option does, and after testing it myself with the page you linked, it works just fine. does your version of wget actually support -k?
I would like to wget the page first,

then "cat" (list) all, completely all, the http links/URLs that are in the webpage.

as you can see in this example, the gif files are not seen or listed by my script.sh

that's a bit difficult to do

any ideas?

best regards and happy Easter
 
Old 04-13-2009, 03:34 AM   #4
Su-Shee
Member
 
Registered: Sep 2007
Location: Berlin
Distribution: Slackware
Posts: 510

Rep: Reputation: 53
You want a combination of wget and lynx.

Lynx has the very handy "lynx -dump" (which appends a list of _all_ links of a webpage) and "lynx -image-links" (which includes the image links in that list too).

So, if you want to fetch all images from webpage XY, you usually do a combination like this:

Code:
# dump the page with image links included, pull the .jpg URLs out,
# and hand each one to wget (-x keeps the directory structure)
lynx -dump -image-links http://www.somewebsite.foo \
  | egrep -o 'http://[^ ]*\.jpg' \
  | while read -r img; do
      wget -x "$img"
    done
Check the website with lynx first, adjust the egrep pattern, feed the result into wget and wrap a loop around it.

To just list all the links of a page, a plain lynx -dump is enough.
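For example, something like this should print only the reference list, assuming your lynx build supports -listonly:
Code:
# show only the numbered list of links, not the rendered page text
lynx -dump -listonly "http://passions.mettavant.fr/pdabureautique.htm"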

I really do _love_ the power of the command line.

Last edited by Su-Shee; 04-13-2009 at 03:36 AM.
 
Old 04-13-2009, 03:38 AM   #5
acid_kewpie
Moderator
 
Registered: Jun 2001
Location: UK
Distribution: Gentoo, RHEL, Fedora, Centos
Posts: 43,417

Rep: Reputation: 1985
as I said, wget's -k option is exactly what you want, so if that's not what you're seeing you're doing it wrong or your version of wget doesn't support -k properly.
 
Old 04-13-2009, 11:21 PM   #6
frenchn00b
Senior Member
 
Registered: Jun 2007
Location: E.U., Mountains :-)
Distribution: Debian, Etch, the greatest
Posts: 2,561

Original Poster
Rep: Reputation: 57
Quote:
Originally Posted by acid_kewpie View Post
as I said, wget's -k option is exactly what you want, so if that's not what you're seeing you're doing it wrong or your version of wget doesn't support -k properly.
thanks!

hmmm interesting... let's check it:
Code:
wget -k "http://passions.mettavant.fr/pdabureautique.htm"
Ok
Code:
$ wget --version
GNU Wget 1.11.4

Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GN
my try was without wget -k, apparently. :?
I got it working now with wget -k.

So it seems we are getting a few solutions.
I guess I also tried curl when I had access to my box... but whether it does that -k conversion... I don't recall.
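For the record, curl itself doesn't rewrite links the way wget -k does, so a rough sketch of the same idea with curl has to prefix the base URL onto relative links by hand (base URL hard-coded here; a real HTML parser would be more robust):
Code:
#!/bin/sh
# rough sketch: list every src/href URL in the page, absolute or not
base="http://passions.mettavant.fr"
curl -s "$base/pdabureautique.htm" \
  | grep -o '\(src\|href\)="[^"]*"' \
  | sed -e 's/^[a-z]*="//' -e 's/"$//' \
  | while read -r u; do
      case "$u" in
        http*) echo "$u" ;;         # already absolute, keep as-is
        "#"*)  ;;                   # skip in-page anchors like "#note"
        *)     echo "$base/$u" ;;   # prefix the site root onto relative links
      esac
    done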
 
  

