LinuxQuestions.org


hedpe 02-15-2006 11:13 PM

wget will not download full webpage with images
 
Hi guys,

I need to figure out how to download a full page with all of its images. I tried wget, but it does not seem to work: it only ever downloads index.html.

The man page states:
[code]
o Retrieve only one HTML page, but make sure that all the elements needed for the page to be displayed, such as inline images and external style sheets, are
also downloaded. Also make sure the downloaded page references the downloaded links.

wget -p --convert-links http://www.server.com/dir/page.html

The HTML page will be saved to www.server.com/dir/page.html, and the images, stylesheets, etc., somewhere under www.server.com/, depending on where they
were on the remote server.
[/code]

So I tried that, but I only get the index:
Code:

gnychis@monster /tmp/test $ wget -p --convert-links http://www.microsoft.com         
--00:10:13--  http://www.microsoft.com/
          => `www.microsoft.com/index.html'
Resolving www.microsoft.com... 207.46.18.30, 207.46.19.30, 207.46.19.60, ...
Connecting to www.microsoft.com|207.46.18.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22,330 (22K) [text/html]

100%[=============================================================================================================================================>] 22,330        65.06K/s           

00:10:14 (64.99 KB/s) - `www.microsoft.com/index.html' saved [22330/22330]


FINISHED --00:10:14--
Downloaded: 22,330 bytes in 1 files
Converting www.microsoft.com/index.html... 0-74
Converted 1 files in 0.001 seconds.

Any suggestions?

Thanks!
George

alunduil 02-15-2006 11:36 PM

Add the -r option for recursive. Read the man page of wget for more info.

Regards,

Alunduil

hedpe 02-15-2006 11:46 PM

That didn't seem to do it:

Code:

hedlinux www.microsoft.com # wget -p --convert-links -r -x http://www.microsoft.com       
--00:46:07--  http://www.microsoft.com/
          => `www.microsoft.com/index.html'
Resolving www.microsoft.com... 207.46.199.30, 207.46.198.60, 207.46.20.30, ...
Connecting to www.microsoft.com|207.46.199.30|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 22,329 (22K) [text/html]

100%[==============================================================================================================================================>] 22,329        88.64K/s           

00:46:08 (88.61 KB/s) - `www.microsoft.com/index.html' saved [22329/22329]

Loading robots.txt; please ignore errors.
--00:46:08--  http://www.microsoft.com/robots.txt
          => `www.microsoft.com/robots.txt'
Reusing existing connection to www.microsoft.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 1,365 (1.3K) [text/plain]

100%[==============================================================================================================================================>] 1,365        --.--K/s           

00:46:08 (48.21 MB/s) - `www.microsoft.com/robots.txt' saved [1365/1365]

--00:46:08--  http://www.microsoft.com/default.aspx
          => `www.microsoft.com/default.aspx'
Reusing existing connection to www.microsoft.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 22,329 (22K) [text/html]

100%[==============================================================================================================================================>] 22,329        --.--K/s           

00:46:08 (3.56 MB/s) - `www.microsoft.com/default.aspx' saved [22329/22329]


FINISHED --00:46:08--
Downloaded: 46,023 bytes in 3 files
Converting www.microsoft.com/index.html... 1-74
Converting www.microsoft.com/default.aspx... 1-74
Converted 2 files in 0.007 seconds.
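
A possible next step, offered as a guess rather than a confirmed fix: the log above shows wget loading robots.txt and then fetching only default.aspx, so the images may be excluded by robots.txt or hosted on a different server, neither of which -p or -r follows by default. The sketch below wraps the original command with two extra standard wget options, -H (--span-hosts) and -e robots=off. The helper name fetch_page and the DRY_RUN variable are made up for illustration and are not part of wget.

```shell
# Sketch only: fetch_page is a hypothetical helper, not a wget feature.
# Set DRY_RUN=1 to print the command instead of running it.
fetch_page() {
    url=$1
    # -p             download page requisites (images, stylesheets)
    # -k             short form of --convert-links
    # -H             span hosts, in case images live on another server (assumption)
    # -e robots=off  ignore robots.txt exclusions (assumption: they block the images)
    set -- wget -p -k -H -e robots=off "$url"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

Usage would be `fetch_page http://www.microsoft.com`, or `DRY_RUN=1 fetch_page http://www.microsoft.com` to see the command first. Combining -H with -r can wander across many sites, so it is probably safer to leave -r out here or to restrict hosts with -D.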



All times are GMT -5.