03-25-2009, 07:41 PM | #1
Member
Registered: Jun 2007
Distribution: Debian Jessie, Bunsenlabs
Posts: 586
Want to download web pages so I can open them offline
Using Firefox 3.0.7 and GNU Wget 1.11.4.
I have a question about downloading web pages. If I save a page with "Web Page, Complete", will I be able to open it without being online, or will there be some pages that I still need to log in for?
If a web browser is not sufficient, is there some command I can use with Wget to accomplish this?
Thanks.
03-25-2009, 07:54 PM | #2
LQ Newbie
Registered: May 2007
Location: Pennsylvania, USA
Distribution: Ubuntu Studio 9.10
Posts: 29
If all you want is a page or two, then saving the page as "Web Page, Complete" will work fine. You'll have all the HTML, images, and such, making the page look (almost) exactly as it did online. The links on the page will still point to their original targets, meaning the online pages, so if you download two pages that are supposed to link together, you'll have to manually change the HTML of the links.
If you're looking to download a whole lot of pages from one site or something like that, you'll need something called an "offline reader". I'm really unfamiliar with them, so I can't offer much advice.
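For what it's worth, here is a minimal sketch of that manual link fix on the command line (the file names and the example.com URL are only placeholders, not anything from your actual pages):
Code:
# page1.html still links to the online copy of page2, e.g.
#   <a href="http://www.example.com/page2.html">
# Rewrite that absolute link so it points at the locally saved file instead:
sed -i 's|http://www.example.com/page2.html|page2.html|g' page1.html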
Hope that helps!
03-25-2009, 08:22 PM | #3
Senior Member
Registered: Feb 2002
Location: harvard, il
Distribution: Ubuntu 11.4, DD-WRT micro plus ssh, lfs-6.6, Fedora 15, Fedora 16
Posts: 3,233
If you don't mind command-line tools, then wget and cURL are options.
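For a single page, a minimal sketch might look like this (the URL and file name are only placeholders):
Code:
# wget: fetch the page plus the images/CSS it needs, rewriting links for offline viewing
wget --page-requisites --convert-links http://www.example.com/somepage.html

# cURL: fetch just the raw HTML of the page (no images, no link rewriting)
curl -o somepage.html http://www.example.com/somepage.html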
03-25-2009, 08:54 PM | #4
LQ Veteran
Registered: Nov 2005
Location: Annapolis, MD
Distribution: Mint
Posts: 17,809
It depends on the web page. If it is going out to different sites to get things, then it will still need a connection to do that. I am sure there are some sites where it would be essentially impossible to download everything so that you would not need a connection.
Read the manual on wget--it has many options, most of which I have not even begun to tackle.
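As one illustration of that, wget can be asked to pull in page requisites even when they live on other hosts; a rough sketch, with a placeholder URL:
Code:
# -p = --page-requisites (images, CSS, etc.), -k = --convert-links (rewrite links for offline use),
# -H = --span-hosts (allow requisites hosted on other domains to be fetched too)
wget -p -k -H http://www.example.com/somepage.html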
03-25-2009, 10:25 PM | #5
Member
Registered: Nov 2006
Location: Melbourne, Australia
Distribution: CentOS, RHEL, Debian, Ubuntu, Mint
Posts: 128
Oh, wget is the tool, and a fun one! (I wasted many a gig of bandwidth while learning that gem!)
Code:
wget --mirror --convert-links --html-extension http://www.gnu.org/
Where http://www.gnu.org/ is the URL of the website. It will download the site (the whole site, so watch it) into a directory named www.gnu.org/ under the current directory.
Read that manual though, because the options can be combined in a huge number of ways!
http://www.gnu.org/software/wget/manual/wget.html
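Since --mirror will happily pull down an entire site, a more restrained sketch can help; the depth, delay, and starting URL below are only illustrative:
Code:
# Recurse only two levels deep, never climb above the starting directory,
# and wait a second between requests to go easy on the server.
wget --recursive --level=2 --no-parent --wait=1 --convert-links --html-extension http://www.gnu.org/software/wget/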
03-26-2009, 11:18 AM | #6
LQ Guru
Registered: Oct 2005
Location: $RANDOM
Distribution: slackware64
Posts: 12,928
One of the coolest scripts I've used is curlmirror, which mirrors a website using curl and is written in Perl:
http://curl.haxx.se/programs/curlmirror.txt
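If you want to try it, a rough sketch of getting it running (the invocation at the end is only a guess; the real options are documented in the comments at the top of the script, so check those first):
Code:
# Grab the script with curl, then run it with Perl. Check the script's own
# header comments for the real usage; the URL argument here is just a guess.
curl -o curlmirror.pl http://curl.haxx.se/programs/curlmirror.txt
perl curlmirror.pl http://www.example.com/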