Thanks al2. It helps me a lot with my job.
Moreover, chrism01, can you help me with the 2nd link? It's a zip file, and when I save it, either the path is not set correctly or something else is wrong, because no link opens. Can you guide me on how to work with http://rute.2038bug.com/index.html.gz
Thanks anyway to both of you for your assistance.
This opens on my system with no problems. (Firefox 2.0.0.11)
I've never seen the .gz extension on html pages....
This is weird: If I save the page to disk and then try to open it, it offers only to open with Ark. This fails, as does double-clicking. I also cannot open the saved file from within FF (even though FF opens it from the site just fine).
If I rename the file with just the .html extension, it opens normally.
AND--if I try to put it in the address line without the .gz, it gives me this:
Quote:
The HTML on this web site is compressed with gzip. A web browser which automatically decompresses these web pages is Mozilla. Note that old version of Internet Explorer do not perform this action. Loading will start in 5 seconds.
Conclusion: When I save the page, it is no longer compressed, but it keeps the .gz extension.
Solution: save and rename the file, or use Firefox/Mozilla
Quote:
Originally Posted by pixellany
This is weird: If I save the page to disk and then try to open it, it offers only to open with Ark. This fails, as does double-clicking. I also cannot open the saved file from within FF (even though FF opens it from the site just fine).
If I rename the file with just the .html extension, it opens normally.
It’s not that weird. File extensions are immaterial to web browsing. Most likely, the extension is there to tell the server how to treat the file. In this case, the .gz extension tells the server to send the file with web compression if the client advertises “Accept-Encoding: gzip” (which Mozilla browsers do). The server transmits the compressed content to the web browser, which decompresses it and interprets it as “Content-Type: text/html” (all transparent to the user). When you use “Save Page” in the web browser, the decompressed version is saved under whatever filename you choose (including the file extension).
If you try to open the file later on (you are no longer web browsing, but file browsing), there is no server to tell the browser how to interpret the file (i.e., give it a Content-type). The only hint it has is the file extension (even if the extension misrepresents the file). So if you have an html text file with a different extension (e.g., .gz), the browser will assume (without looking at the contents of the file) that the MIME-type corresponds to that extension (e.g., application/x-gzip) instead of text/html.
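A rough sketch of the mismatch described above: looking at the first bytes of the file settles what it really is, regardless of the name. The function names (looks_gzipped, sniff) and the sample HTML are made up for illustration; the only real fact relied on is that a gzip stream starts with the bytes 0x1f 0x8b.
Code:
```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(data: bytes) -> bool:
    """Return True if the bytes start with the gzip magic number."""
    return data[:2] == GZIP_MAGIC


def sniff(filename: str, data: bytes) -> str:
    """Describe the file by its content, ignoring the extension."""
    kind = "gzip data" if looks_gzipped(data) else "not gzip"
    return f"{filename}: {kind}"


# Hypothetical page content standing in for the real file.
html = b"<html><body>RUTE</body></html>"

# Saved through the browser: already decompressed, but still named .gz.
print(sniff("index.html.gz", html))

# Fetched raw (e.g. with wget): the bytes really are gzip.
print(sniff("index.html.gz", gzip.compress(html)))
```
This is roughly the check the file command performs, and it is why renaming the browser-saved copy to .html is the right fix.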
If you try this sort of thing in something like wget, you will have to decompress the file yourself. E.g.,
Code:
$ wget http://rute.2038bug.com/index.html.gz
--15:52:40-- http://rute.2038bug.com/index.html.gz
=> `index.html.gz'
Resolving rute.2038bug.com... 196.15.148.250
Connecting to rute.2038bug.com|196.15.148.250|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20,378 (20K) [text/html]
100%[=============================================================================>] 20,378 25.72K/s
15:52:42 (25.67 KB/s) - `index.html.gz' saved [20378/20378]
$ file index.html.gz
index.html.gz: gzip compressed data, was "T", from Unix, last modified: Fri Oct 19 17:28:13 2007
$ gunzip index.html.gz
$ file index.html
index.html: HTML document text
As you can see, the file is sent to be interpreted as text/html even though it is encoded with gzip.
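For comparison, here is a minimal sketch of the step the browser does for you and wget does not. The response is simulated (no network), and the header values are assumptions based on the discussion in this thread, not captured from the real server. decode_body is a hypothetical helper, and a plain dict stands in for real header handling, which would be case-insensitive.
Code:
```python
import gzip


def decode_body(headers: dict, body: bytes) -> bytes:
    """Undo the Content-Encoding, as a browser would before rendering."""
    encoding = headers.get("Content-Encoding", "identity").lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "identity":
        return body
    raise ValueError(f"unsupported Content-Encoding: {encoding}")


# Simulated response: headers assumed, body compressed locally.
page = b"<html><body>hello</body></html>"
headers = {"Content-Type": "text/html", "Content-Encoding": "gzip"}
wire_bytes = gzip.compress(page)

decoded = decode_body(headers, wire_bytes)
print(decoded == page)
```
The Content-Type says what the payload is once decoded; the Content-Encoding only says how it was wrapped for transport.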
OK--replace "weird" with "I've never seen it before." (In thousands of hours of web browsing.)
Is the browser response to a g-zipped file in the W3C standards?
When Firefox saves a file with the .gz extension--but which is no longer a g-zip file--is that standards-compliant? If so, find me the people that maintain these standards.....
Quote:
Originally Posted by pixellany
OK--replace "weird" with "I've never seen it before." (In thousands of hours of web browsing.)
You probably have seen web compression, it’s just usually done completely transparently (e.g., even if the stream is compressed, you won’t notice any extensions in the URI).
Quote:
Originally Posted by pixellany
Is the browser response to a g-zipped file in the W3C standards?
The browser (when browsing through http) doesn’t (or shouldn’t) have a clue of the file types corresponding to various paths (i.e., it shouldn’t try to interpret extensions). What it should do is send a request to the server (consisting mostly of a host and an absolute path), and interpret the response based on the entity headers (again not by looking at the extension). For example, if the headers indicate the Content-Type to be “application/x-gzip” with no Content-Encoding, then the browser should use its application/x-gzip handler (usually saving the file to disk or opening a temporary copy with an external program). This is what happens with most URIs which end in “.gz” (e.g., http://www.kernel.org/pub/linux/kern...-2.6.23.tar.gz). If the headers indicate the Content-Type to be “text/html” with a Content-Encoding of “gzip”, then the browser should decompress the stream and handle the result with the text/html handler (which is usually the browser itself). This is what happened with the RUTE URI. The choice of the headers themselves is made by the server (not the browser).
For example, some servers choose to send files which end in “.c” as “text/plain”, and others choose to use “text/x-csource” or “application/octet-stream”. That choice often decides whether a browser shows the contents of the file itself or opens the default editor to show them. A similar thing happens for postscript files (e.g., if sent as “text/plain” you are shown the source code for the postscript file, but if sent as “application/postscript” it is opened by a postscript viewer). If I wanted, I could mismatch all the file extensions on my server, as long as I told you how to use them correctly. The only time the browser will try to read the extensions itself is when the Content-Type is given as “application/octet-stream”.
All this is in the “standards”. Mostly it is in the HTTP standard (RFC 2616).
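To make the two cases concrete, here is a toy sketch of how a client might pick an action from the headers alone. choose_action and the action strings are invented for illustration; a real browser has a much larger handler table and more rules.
Code:
```python
def choose_action(headers: dict) -> str:
    """Pick an action from the response headers, never from the URI."""
    content_type = headers.get("Content-Type", "application/octet-stream")
    # Drop parameters such as "; charset=utf-8".
    media_type = content_type.split(";")[0].strip().lower()
    encoding = headers.get("Content-Encoding", "").lower()

    steps = []
    if encoding in ("gzip", "x-gzip"):
        steps.append("decompress")
    if media_type == "text/html":
        steps.append("render")
    elif media_type == "text/plain":
        steps.append("show as text")
    else:
        steps.append("save or hand to external program")
    return " then ".join(steps)


# A typical tarball download: gzip is the content itself.
print(choose_action({"Content-Type": "application/x-gzip"}))

# The RUTE case: HTML that was merely compressed for transport.
print(choose_action({"Content-Type": "text/html",
                     "Content-Encoding": "gzip"}))
```
Note that the URI never appears as an argument: both responses could come from paths ending in “.gz”.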
Quote:
Originally Posted by pixellany
When Firefox saves a file with the .gz extension--but which is no longer a g-zip file--is that standards-compliant? If so, find me the people that maintain these standards.....
Well, on my Firefox, this is what happens: I go to “File->Save Page As” and a save dialog comes up. It asks me where to save the page and what the filename should be. The default place to save it is my home directory, and the default filename is usually the basename of the absolute path portion of the URI. I don’t think the W3C or any other standards body cares what default filename a browser uses. I am not even aware that the W3C mandates a browser to have “Save Page” functionality.
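The default-filename behaviour I described can be imitated in a few lines. This is only a guess at the logic, not Firefox's actual code; default_filename and the "index.html" fallback for paths ending in a slash are my own assumptions.
Code:
```python
import posixpath
from urllib.parse import unquote, urlsplit


def default_filename(uri: str) -> str:
    """Suggest a filename: the basename of the URI's path component."""
    path = urlsplit(uri).path
    # Fallback name for paths ending in "/" is an assumption.
    return posixpath.basename(unquote(path)) or "index.html"


print(default_filename("http://rute.2038bug.com/index.html.gz"))
```
Nothing here looks at what the bytes are, which is how a decompressed HTML page ends up with a .gz name.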