I have multiple computers in the household, only one of which has internet access. What I need is to be able to take a whole site (usually an FAQ, a guide, a free online book, etc.), mirror the whole thing on my hard drive, where I can tar and gzip it, copy it to floppy, and move it to the machine where I'll be needing it on hand. Preferably quickly, because everybody in the house uses the internet machine.
Now, I tried using the "getlinks" script from Wicked Cool Shell Scripts (here: http://www.intuitive.com/wicked/), combined with a script I wrote (posted at my blog: http://hackersnest.modblog.com/?show...blog_id=615755), and it actually worked for a couple of sites. Unfortunately, the internet features hundreds of different site-indexing methods, each incompatible with this approach in its own unique way, and I'm constantly rewriting the script to deal with each site's quirks. It seems like every time I find a more general-purpose solution, three more exceptions turn up that break it!
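For anyone unfamiliar with the approach, here is a rough Python sketch of the link-gathering step. It is not the actual getlinks script or my blog script, and the names LinkCollector and same_site_links are made up for illustration. It pulls every href out of a page's HTML, resolves relative links against the page URL, and keeps only the ones on the same host. It assumes the site exposes its pages through ordinary <a href> links, which is roughly the assumption that each site's quirks tend to break:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag fed to it (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(page_url, html):
    """Return absolute, de-duplicated links that stay on page_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(page_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links, then drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links
```

Fetching is left out on purpose: the HTML could come from lynx -source or any other downloader, and feeding each returned URL back through same_site_links would walk the rest of the site.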
Now I have http://www.slackware.com/book/, which uses some kind of scheme that even the lynx -source | getlinks combo can't handle. Has anybody ever found an all-purpose, one-shot tool for Linux to do this?
My distros? I use Red Hat 9.0, Slackware 10.1, Debian 3.1 (barely), D*mn Small Linux version-I-forget, Knoppix Live CD 3.7 and Mepis Zeddy. The one with the internet connection is Red Hat, dual-booted with Lose^H^H^H^H Win98.
PS I don't care about getting pictures/whistles/bells/etc. Just the plain ol' text would be fine.
PPS edit: I got lucky and found the .org site's link to download the tarball, so the specific case is over... but I still need the general-case solution!