I'm using this command:
wget --convert-links -np --wait=20 --limit-rate=20K -m -k -p -U Mozilla -c http://www.thekneeslider.com/ -O kneetest.html
I can only use Konqueror (on an Asus Eee PC 701) to view the saved page completely.
The image taimoshan-front-qtr.jpg does not show, but if I right click and save, I can view the jpg.
I was able to see this jpg earlier in the day, but not now.
I've wget'd the site about 5 or 6 times in a day and sometimes I'll miss 2 images.
I've also tried without the -U Mozilla option.
BTW: Is there a way to get Mozilla to read files from wget?
What do you mean exactly with "get Mozilla to read files from wget"? wget is just a download tool. If it manages to fetch all of the files of a site and convert the html links properly, then you should be able to view them in any browser. If something can't be seen then it's because it's failing to do one or the other of the above.
But while it's possible to use wget to do simple recursive site mirroring, it's really not its main purpose. I recommend using httrack instead, which is designed specifically for downloading whole sites, including fetching regular updates.
Just open Konqueror, and in the "File" menu, select "Open File." If that doesn't work, it is broken.
If I use Mozilla's "Save Page As", I get a separate directory of jpgs etc. With wget, I get only one html file. I assume wget compresses the jpgs etc. and keeps a 'link' to them somewhere for offline use. When I open the html downloaded with wget, I can view almost everything from the page. Not so with Mozilla Firefox.
Konqueror does open Firefox, so I assume Konqueror is pre-processing the file?
Online, I can open the local file and see everything using ff or konq., so the online 'links' work.
You seem to be a bit unclear about what wget is supposed to be able to do. You can't use it to create a single file that has both html and images in it (at least not one that works, afaik). Either it will download all the embedded images as separate files and save them to a directory structure on your system, or it will download only the basic html in a single file, and all the images will remain on the web. In no way does wget "compress and store" the images anywhere, except in the standard mirrored directory structure just described.
Next, the command you posted doesn't even seem to be correct.
"-k" and "--convert-links" are the same option, and thus redundant. But trying to run it with that option gives me an error that it's incompatible with -O. Which makes sense because -O directs all content into a single file, but -k is designed to correct all the links to point to the files in the directory structure I mentioned above. I'm sure there are other problems as well, but I'm not an expert with wget.
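For what it's worth, here is a minimal sketch of how a cleaned-up command might be put together from Python, since the fetches are apparently being scripted. It assumes only the one page and its images are wanted rather than a full mirror (-m), keeps the wait and rate limit from the original command, and drops both -O and the duplicate -k. The function name build_wget_command and the target directory "kneeslider" are made up for illustration, and none of this has been tested against the site in question.

```python
import shlex


def build_wget_command(url, dest_dir, wait_seconds=20, rate="20K"):
    """Return a wget argument list that saves one page plus its images.

    Hypothetical helper: the name and the defaults are illustrative only.
    """
    return [
        "wget",
        "--page-requisites",               # -p: also fetch images, CSS, etc.
        "--convert-links",                 # -k: rewrite links to the local copies
        "--no-parent",                     # -np: never climb above the start URL
        f"--wait={wait_seconds}",          # pause between requests
        f"--limit-rate={rate}",            # cap the download speed
        "--user-agent=Mozilla",            # -U Mozilla, as in the original command
        f"--directory-prefix={dest_dir}",  # -P: where the directory tree is written
        url,
    ]


# Only build and print the command; running it is left to the caller,
# e.g. with subprocess.run(cmd, check=True).
cmd = build_wget_command("http://www.thekneeslider.com/", "kneeslider")
print(shlex.join(cmd))
```

Without -O, wget writes the html and the images as separate files under the given directory, which is what --convert-links needs in order to have something local to point at. If the images turn out to be served from a different host than the page, --span-hosts (-H) may be needed as well, but that is a guess about this particular site.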
Also, konqueror in file-browsing mode doesn't do any "pre-processing". It simply opens up files in whatever programs are associated with them. And if used in web-browser mode, it will simply render the page just like any other browser would.
So in the end, I don't know exactly why your images aren't loading, but it may be because of some bad url formatting or a browser cache problem or something, because I'm pretty sure that no actual images are being stored on your system and the links are still pointing to their web-based locations.
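One way to check that guess is to look at where the image references in the saved html actually point. The following is a rough sketch using only the Python standard library; the names ImageSourceCollector and split_image_sources are invented here, and the sample html uses example.com placeholders rather than anything taken from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ImageSourceCollector(HTMLParser):
    """Collect the src attribute of every <img> tag (illustrative helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def split_image_sources(html_text):
    """Return (remote, local) lists of image references found in html_text."""
    collector = ImageSourceCollector()
    collector.feed(html_text)
    remote, local = [], []
    for src in collector.sources:
        parsed = urlparse(src)
        # An http(s) scheme or a network location means the image is still
        # being fetched from the web rather than from the local disk.
        if parsed.scheme in ("http", "https") or parsed.netloc:
            remote.append(src)
        else:
            local.append(src)
    return remote, local


# Placeholder html; in practice read the file saved by wget instead.
sample = (
    '<img src="http://example.com/images/photo.jpg">'
    '<img src="images/logo.jpg">'
)
remote, local = split_image_sources(sample)
print("remote:", remote)
print("local:", local)
```

If every image lands in the remote list, the saved file only displays fully when the browser is online or already has those images in its cache.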
Finally, there is one way I know to save everything in a single file: the mozilla archive format extension. It bundles everything into a zip-file based archive that can be viewed in firefox.
I agree, I'm not clear where the images are or what -k really means. Thanks for looking at the command. I started out with a different command and kept adding more options out of desperation. I know I can see the images offline only with Konqueror. These images must be cached somewhere, but I can't find them. I'm using Python to automate some bookmark fetches and keep viewable html files offline. I've looked at add-ons for browsers, but I keep trying to come up with complete files that I can tar.
When I load an html file in Konqueror (launched from a Python server) I can see usr/bin/firefox/ lines, so it looks like Konqueror uses Firefox for something??
No, konqueror does not use firefox/mozilla in any way. It's a completely separate program with a completely different and distinct rendering system. There's no interaction between them, other than that konqueror can be told to open html files in firefox or another browser.
Without seeing those exact html lines you're talking about I can't say what they are. What "html" are you talking about anyway? Are you viewing the page source? The source saved using wget, or with the mozilla "saved page"? And where are the lines displayed? Are you doing this in offline mode? Or are you talking about some output from the python script? To tell the truth, it's kind of hard to understand what you're referring to sometimes. Perhaps you should take the time to explain what you're doing, step by step.
Most probably the images load because of the browser cache: whenever you open a page in a browser, copies of its images are usually stored there. So chances are it's simply re-loading images that have already been saved. But when you open the same page in a different browser, the images haven't been cached yet, and so it needs to download them first. The same goes when the page changes to add a new photo; a copy of it needs to be downloaded into the browser before it can be displayed.
That's what offline mode is, by the way. The browser disconnects from the net and only displays files that have been previously stored in its cache.
You're doing something similar when you save a page to disk using wget or "save page". You're creating a local copy of the web page that can be opened in the browser without connecting to the net. But when you save the page as a "single file", only the html code is saved. The images haven't been stored anywhere in the archiving process, so they still have to be fetched from the net (or from the browser cache if the browser has previously cached them).
In any case I think you're making this too difficult for yourself. If you want pages that work properly, forget about trying to save everything in a single file and simply use the multiple-file directory structure that's the default for saving "complete" pages. Then as long as the links have been converted properly, they should work in any browser.
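Since the goal mentioned earlier was something that can be tarred, the directory structure can still end up as one file afterwards. Here is a small sketch with Python's standard tarfile module; archive_mirror and the directory name "kneeslider" are hypothetical, and the demo builds a throwaway directory instead of touching a real download.

```python
import tarfile
import tempfile
from pathlib import Path


def archive_mirror(mirror_dir, archive_path):
    """Pack mirror_dir (html plus images) into a gzip-compressed tar file."""
    mirror_dir = Path(mirror_dir)
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory's own name, so the
        # archive unpacks into a single folder with the links still valid.
        tar.add(mirror_dir, arcname=mirror_dir.name)
    return archive_path


# Demo on a temporary stand-in for a saved page and its images.
with tempfile.TemporaryDirectory() as tmp:
    site = Path(tmp) / "kneeslider"
    (site / "images").mkdir(parents=True)
    (site / "index.html").write_text('<img src="images/a.jpg">')
    (site / "images" / "a.jpg").write_bytes(b"")
    archive = archive_mirror(site, Path(tmp) / "kneeslider.tar.gz")
    with tarfile.open(archive) as tar:
        print(sorted(tar.getnames()))
```

Unpacking the archive elsewhere gives back the same html and image files, so the relative links converted by wget keep working in whichever browser opens them.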