LinuxQuestions.org

LinuxQuestions.org (/questions/)
-   Linux - Newbie (https://www.linuxquestions.org/questions/linux-newbie-8/)
-   -   import entire web site (https://www.linuxquestions.org/questions/linux-newbie-8/import-entire-web-site-615536/)

clwhitt 01-22-2008 12:58 PM

import entire web site
 
What I would like to do is import an entire web site onto my local computer so that I can work on it with Bluefish while offline (these are my web sites).
I've done it once, but for the life of me I can't remember how. I'm pretty sure it was a command-line process that got it done.

Thanks,
Chuck
Ubuntu Gutsy Gibbon

David the H. 01-22-2008 01:03 PM

I suggest httrack. It's a dedicated CLI site downloader that's very powerful and flexible. There's also a web interface (WebHTTrack) available for it.

wget can also mirror whole sites, but I think it has trouble with certain directory structures; at least, I've seen posts from people who've had problems with it before.
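For reference, minimal invocations of each look something like this (example.com, the output directory, and the filter pattern are placeholders, not anything from this thread):

    httrack "http://www.example.com/" -O ~/mirror "+*.example.com/*" -v
    wget --mirror --convert-links --page-requisites http://www.example.com/

With httrack, -O sets the output directory and the quoted "+" pattern keeps the crawl on the original domain; wget's --mirror is shorthand for -r -N -l inf --no-remove-listing, and --convert-links rewrites links so the copy browses cleanly offline.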

pljvaldez 01-22-2008 01:07 PM

A quick Google search turned up these:

http://linuxreviews.org/quicktips/wget/
http://www.httrack.com/

clwhitt 01-22-2008 01:20 PM

import entire web site
 
I'll look at httrack and see how it works. I think I might have used wget to do the job before (it rings a bell), though I just can't remember exactly how.
pljvaldez - what search phrase did you use? I consider myself pretty good with Google, but I spent a couple of frustrating hours trying to find any useful information that wasn't related to FrontPage.

Chuck

pljvaldez 01-22-2008 01:27 PM

I used "linux download entire website".

The wget article was first, and the httrack one was third.

clwhitt 01-22-2008 01:44 PM

import entire web site
 
I tried that without the "entire" and with "web site" as two words, and all I got was a bunch of hits on downloading Live CDs. I tried "import" rather than "download", and a whole bunch of other variations too. Dang it, I was so close! :)
I did use wget before. Following these messages, and knowing what I was looking for, I checked my shell history and saw that I had used wget with the recursive switch (-r) to download the website.
I'm still going to read up on httrack to see how it works. BTW, the wget link you posted was an interesting read on how it can be used to get around websites that put up obstacles to being downloaded.
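If plain wget -r ever pulls in too much, or leaves the saved pages linking back to the live site, a few standard flags are commonly combined with it; which of them you actually need depends on the site:

    wget -r -np -k -p http://www.example.com/

Here -np (--no-parent) stops wget from climbing above the starting directory, -k (--convert-links) rewrites links for local browsing, and -p (--page-requisites) fetches the images and stylesheets each page needs.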

Thanks for your help,
Chuck

clwhitt 01-23-2008 06:53 PM

import entire web site
 
As a follow-up here, I was able to download my web site quickly, completely, and easily with wget -r http://www.obabytheboat.com. WebHTTrack, on the other hand, spent several hours parsing the site and still hadn't gotten it all before I lost patience. I haven't spent any more time with it yet to figure out where the fault lies, and I haven't tried the CLI version either, so I may get to that just for kicks. I'll try to remember to post a report on what I find.
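For anyone who gets to the CLI version first, the basic invocation pattern from the httrack documentation looks like this (the output directory and the domain filter below are illustrative, not something tested in this thread):

    httrack "http://www.obabytheboat.com/" -O ~/websites/obaby "+*.obabytheboat.com/*" -v

-O names the directory the mirror is written to, the "+" pattern restricts the crawl to the one domain, and -v prints verbose progress.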
Thanks, everyone, for your help.
Chuck

