LinuxQuestions.org
Old 02-15-2006, 11:13 PM   #1
hedpe
Member
 
Registered: Jan 2005
Location: Boston, MA
Distribution: Debian
Posts: 380

Rep: Reputation: 30
wget will not download full webpage with images


Hi guys,

I need to figure out how to download a full page with all of its images. I tried wget, but it doesn't seem to work: it keeps downloading only index.html.

The man page states:
[code]
o Retrieve only one HTML page, but make sure that all the elements needed for the page to be displayed, such as inline images and external style sheets, are also downloaded. Also make sure the downloaded page references the downloaded links.

wget -p --convert-links http://www.server.com/dir/page.html

The HTML page will be saved to www.server.com/dir/page.html, and the images, stylesheets, etc., somewhere under www.server.com/, depending on where they were on the remote server.
[/code]

So I try it, but I only get the index:
Code:
gnychis@monster /tmp/test $ wget -p --convert-links http://www.microsoft.com           
--00:10:13--  http://www.microsoft.com/
           => `www.microsoft.com/index.html'
Resolving www.microsoft.com... 207.46.18.30, 207.46.19.30, 207.46.19.60, ...
Connecting to www.microsoft.com|207.46.18.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22,330 (22K) [text/html]

100%[=============================================================================================================================================>] 22,330        65.06K/s             

00:10:14 (64.99 KB/s) - `www.microsoft.com/index.html' saved [22330/22330]


FINISHED --00:10:14--
Downloaded: 22,330 bytes in 1 files
Converting www.microsoft.com/index.html... 0-74
Converted 1 files in 0.001 seconds.
Any suggestions?

Thanks!
George
 
Old 02-15-2006, 11:36 PM   #2
alunduil
Member
 
Registered: Feb 2005
Location: San Antonio, TX
Distribution: Gentoo
Posts: 684

Rep: Reputation: 62
Add the -r option for recursive retrieval. Read the wget man page for more info.

Regards,

Alunduil
 
Old 02-15-2006, 11:46 PM   #3
hedpe
Member
 
Registered: Jan 2005
Location: Boston, MA
Distribution: Debian
Posts: 380

Original Poster
Rep: Reputation: 30
That didn't seem to do it:

Code:
hedlinux www.microsoft.com # wget -p --convert-links -r -x http://www.microsoft.com        
--00:46:07--  http://www.microsoft.com/
           => `www.microsoft.com/index.html'
Resolving www.microsoft.com... 207.46.199.30, 207.46.198.60, 207.46.20.30, ...
Connecting to www.microsoft.com|207.46.199.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22,329 (22K) [text/html]

100%[==============================================================================================================================================>] 22,329        88.64K/s             

00:46:08 (88.61 KB/s) - `www.microsoft.com/index.html' saved [22329/22329]

Loading robots.txt; please ignore errors.
--00:46:08--  http://www.microsoft.com/robots.txt
           => `www.microsoft.com/robots.txt'
Reusing existing connection to www.microsoft.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 1,365 (1.3K) [text/plain]

100%[==============================================================================================================================================>] 1,365         --.--K/s             

00:46:08 (48.21 MB/s) - `www.microsoft.com/robots.txt' saved [1365/1365]

--00:46:08--  http://www.microsoft.com/default.aspx
           => `www.microsoft.com/default.aspx'
Reusing existing connection to www.microsoft.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 22,329 (22K) [text/html]

100%[==============================================================================================================================================>] 22,329        --.--K/s             

00:46:08 (3.56 MB/s) - `www.microsoft.com/default.aspx' saved [22329/22329]


FINISHED --00:46:08--
Downloaded: 46,023 bytes in 3 files
Converting www.microsoft.com/index.html... 1-74
Converting www.microsoft.com/default.aspx... 1-74
Converted 2 files in 0.007 seconds.
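
One thing worth checking, as an unconfirmed guess: the wget manual notes that -p on its own does not fetch requisites that live on other hosts, and suggests adding -H (--span-hosts) for that case. If the "Converting ... 1-74" line means what it appears to mean (1 link rewritten to a local file, 74 left as absolute URLs), then most of what the page references was never downloaded. The sketch below lists the hosts that the src attributes in a saved page point at. list_src_hosts is a made-up helper name, not part of wget, and the pattern only catches double-quoted absolute http/https URLs, so treat the result as a rough check.

```shell
# Hypothetical helper (not part of wget): print each distinct host that the
# src="http://..." attributes in a saved HTML file point at, one per line.
# Assumes double-quoted, absolute URLs; anything else is silently skipped.
list_src_hosts() {
    grep -oiE 'src="https?://[^/"]+' "$1" \
        | sed -E 's|^.*://||' \
        | tr 'A-Z' 'a-z' \
        | sort -u
}

# Example call against the file saved in the run above:
#   list_src_hosts www.microsoft.com/index.html
```

If hosts other than www.microsoft.com show up, the combination the wget manual suggests for a single page plus its requisites is `wget -E -H -k -K -p http://www.microsoft.com/`. Whether that fixes this particular site is untested here. A robots.txt exclusion is another candidate, since wget honours robots.txt during recursive retrieval.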
 
  

