[SOLVED] Software that can pack external resources into a HTML file
I'd rather ask here than in the Software forum, but if you think that is a better place, feel free to move the topic.
I have a small HTML+JavaScript+CSS project consisting of:
1. a single HTML file
2. a few external JavaScript source files
3. a few external CSS files
4. a few external image files
All files mentioned in 2., 3. and 4. are used by the HTML file. What software (it has to be free software and run on Linux) can I feed all these goodies to and have it produce a single HTML file with all resources bundled (and the JS code minified)?
I know that I can:
1. minify the JavaScript source files using standalone minifier software (I don't know any particular names yet, because I have not researched that deeply)
2. include JavaScript code in an HTML file using a <script> tag
3. include CSS code in an HTML file using a <style> tag
4. include base64-encoded images in an HTML file using data URLs
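If no ready-made tool turns up, the four manual steps above can be scripted. Below is a minimal regex-based Python sketch of such a bundler; the tag patterns it assumes (double-quoted relative paths, simple tag forms) are my own simplification, and a real HTML parser would be far more robust:

```python
# Sketch of a one-shot bundler for the manual steps above.
# Assumes resources are referenced via simple double-quoted relative
# paths; a regex approach like this misses edge cases a real HTML
# parser would handle.
import base64
import mimetypes
import re
from pathlib import Path

def bundle(html_path: str) -> str:
    root = Path(html_path).parent
    html = Path(html_path).read_text(encoding="utf-8")

    def read(rel: str) -> str:
        return (root / rel).read_text(encoding="utf-8")

    # <script src="app.js"></script>  ->  <script>...</script>
    html = re.sub(
        r'<script\s+src="([^"]+)"\s*>\s*</script>',
        lambda m: "<script>" + read(m.group(1)) + "</script>",
        html,
    )

    # <link rel="stylesheet" href="style.css">  ->  <style>...</style>
    html = re.sub(
        r'<link\s+rel="stylesheet"\s+href="([^"]+)"\s*/?>',
        lambda m: "<style>" + read(m.group(1)) + "</style>",
        html,
    )

    # <img src="logo.png">  ->  <img src="data:image/png;base64,...">
    def img_data_url(m):
        rel = m.group(1)
        mime = mimetypes.guess_type(rel)[0] or "application/octet-stream"
        data = base64.b64encode((root / rel).read_bytes()).decode("ascii")
        return f'<img src="data:{mime};base64,{data}"'

    html = re.sub(r'<img\s+src="([^"]+)"', img_data_url, html)
    return html
```

Minification could then be bolted on by piping each file through a standalone minifier before embedding.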
However, all these would be manual steps, and since this is an evolving project, I'll need to do this multiple times.
That's why I'm looking for a one-shot command-line utility that can do everything at once, if such a thing exists. Since this seems like a generic task, I hope it does. I'm not very familiar with webdev tech, so I don't know the relevant keywords to search for; hence my question in this forum.
I wrote something similar for myself, though it handles neither JavaScript nor CSS (and it requires my forked version of htmlmin). Perhaps it would be a good starting point if no existing solution fits your needs.
IMO what the OP describes is what being a web developer is...
Incorporate javascript and css into a web page with the appropriate tags. Change the referenced scripts if required...no need to change the tags in the web page.
Example: Building a site with several pages. All have the same "look and feel" controlled with css and/or javascript.
Later, some change in the "look and feel" is required...so the css and/or javascript files are changed; no need to touch the html pages.
Organize the project to facilitate any changes with minimal coding where you can. Use a version control system if there are multiple developers.
Apparently the topic is even broader than I initially thought. For example, in my preliminary planning I missed that I can also minify the HTML code itself. I have spent a couple of hours searching and learned that there are a lot of tools for minifying HTML/JavaScript/CSS; I'm sure that after I try a couple of them I'll be able to settle on something. Thanks for the pointers, marking the thread solved.
^ nevertheless, may i ask why you think you need to do this?
faster loading of web pages?
you know there's nothing wrong with loading resources from multiple files. i don't think it's slower than loading the same amount of information from one file, or at least the added overhead is orders of magnitude smaller than the loading itself.
also, - i do not know this, i'm asking - are base64 encoded images the same size as the (original) images, or is the resulting text file larger?
Quote:
Originally Posted by ondoho
^ nevertheless, may i ask why you think you need to do this?
faster loading of web pages?
Dunno about OP but my use case is to create a stand-alone HTML file which I can share by simply mailing it to people.
Quote:
Originally Posted by ondoho
you know there's nothing wrong with loading resources from multiple files. i don't think it's slower than loading the same amount of information from one file, or at least the added overhead is orders of magnitude smaller than the loading itself.
Loading the same number of bytes from one file is likely faster than loading them from multiple files. As such, embedding CSS or JavaScript into a page may decrease both load time and the amount of data transferred.
HTTP requests and responses have overhead, so fetching a 1 kB file can easily result in transferring several kilobytes back and forth. This is why even embedding binary data via the data: scheme may reduce the total data transferred: the increase in size due to base64 encoding (see below) may be offset by the reduction in HTTP chatter.
Furthermore, making an HTTP request takes time, so even if you send more data the page might load faster because one (or more) round trips are averted.
This all changes, of course, if you start embedding data which is shared by multiple pages, or if the embedded data rarely changes compared to the HTML page. In those situations, unless for some bizarre reason you’re optimising for the first time a person opens the page (at the cost of subsequent visits), embedding will increase load time, since the browser won’t be able to use its cache. Embedding resources into the HTML of a regular web page is therefore rarely (if ever) a good idea, but there are other use cases, such as the stand-alone HTML file I mentioned above, where it is not necessarily bad.
(And then there’s HTTP/2 with multiplexed channels and Server Push which is yet another can of worms).
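To make the trade-off concrete, here is a back-of-the-envelope sketch; the per-request overhead and asset sizes are numbers I made up for illustration, not measurements, and in practice only binary assets pay the base64 tax (text assets embed verbatim):

```python
# Hypothetical comparison: N separate HTTP fetches vs. one bundled page.
OVERHEAD = 700                  # assumed bytes of HTTP headers per request/response pair
resources = [1000, 2000, 500]   # hypothetical sizes of three small assets

# Separate files: each asset pays the header overhead, plus one fetch for the page.
separate = sum(size + OVERHEAD for size in resources) + OVERHEAD

# Embedded: a single fetch, but each asset grows by the base64 factor of 4/3.
embedded = sum(size * 4 // 3 for size in resources) + OVERHEAD

print(separate, embedded)  # the bundled page wins under these assumptions
```

Flip the assumptions (tiny headers, large assets, warm cache) and the conclusion flips too, which is exactly the caching caveat above.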
Quote:
Originally Posted by ondoho
also, - i do not know this, i'm asking - are base64 encoded images the same size as the (original) images, or is the resulting text file larger?
base64 encodes three source bytes as four output bytes; by itself it results in exactly a one-third (≈33.3%) increase. The MIME line-wrapping overhead does not apply when embedding images on a web page, and the overhead of the data: scheme is just several bytes, so it can usually be neglected. The picture is muddled if the server compresses data sent to the client, since then the increase in size may largely disappear (alas at a cost of CPU time and possibly latency, unless the server already has a pre-compressed version of the file ready).
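The 4-for-3 expansion is easy to verify directly with Python's standard library:

```python
# base64 emits 4 output bytes for every 3 input bytes, so the exact
# growth factor is 4/3, plus up to two '=' padding bytes at the end.
import base64

raw = bytes(3000)              # 3000 arbitrary input bytes
encoded = base64.b64encode(raw)
print(len(raw), len(encoded))  # 3000 4000

print(base64.b64encode(b"ab"))  # 2 input bytes still pad out to 4: b'YWI='
```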
Quote:
base64 encodes three source bytes as four output bytes; by itself it results in exactly a one-third (≈33.3%) increase.
does this take into account that most text files are UTF-8-encoded, where every character displayed is at least 2 bytes (at least for latin alphabet languages)?
Quote:
does this take into account that most text files are UTF-8-encoded, where every character displayed is at least 2 bytes (at least for latin alphabet languages)?
Yes it does, and your statement is false: in UTF-8, every character of the basic Latin alphabet is just one byte. UTF-8 encodes ASCII characters as single bytes, and since base64 uses only ASCII characters, there is no additional overhead.
You may be confusing UTF-8 with UTF-16, which is an abomination of an encoding; please never use it. (I’m looking at you, Java.)
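A quick way to convince yourself of the byte counts (the sample strings are arbitrary):

```python
# ASCII characters are single bytes in UTF-8, so base64 text (which is
# pure ASCII) carries no extra UTF-8 overhead; non-ASCII letters do.
text = "base64"
assert len(text.encode("utf-8")) == len(text)  # 1 byte per ASCII character

assert len("é".encode("utf-8")) == 2           # non-ASCII Latin letter: 2 bytes
assert len("é".encode("utf-16")) == 4          # 2-byte BOM + one 2-byte code unit
```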
PS. You might find my blog post about Unicode interesting as it touches on UTF-8 vs. UTF-16 debacle.
Quote:
Originally Posted by ondoho
^ nevertheless, may i ask why you think you need to do this?
faster loading of web pages?
My goal is to produce a single HTML file for end users. It is easier to distribute a single HTML file that can just be opened in the web browser by double clicking on it than having an archive containing a directory tree (at least a subdirectory for images) and multiple files. Unfortunately, even in 2018 there are users who can't handle archives.
Quote:
Originally Posted by ondoho
also, - i do not know this, i'm asking - are base64 encoded images the same size as the (original) images, or is the resulting text file larger?
AFAIK, binary data (which image files are) grows in size when base64-encoded, since each base64 output byte carries only 6 bits of source data (base64 uses an alphabet of 64 characters, a subset of ASCII).
But I hope to compensate for that by minifying the JavaScript and CSS files (and probably the HTML, as I have learned from this thread), since I don't have that many image files.
Quote:
My goal is to produce a single HTML file for end users. It is easier to distribute a single HTML file that can just be opened in the web browser by double clicking on it than having an archive containing a directory tree (at least a subdirectory for images) and multiple files. Unfortunately, even in 2018 there are users who can't handle archives.
You might want to check out CHM files, though I don’t recall which browsers support them.
Quote:
My goal is to produce a single HTML file for end users. It is easier to distribute a single HTML file that can just be opened in the web browser by double clicking on it than having an archive containing a directory tree (at least a subdirectory for images) and multiple files. Unfortunately, even in 2018 there are users who can't handle archives.
Now I'm curious.
"Distribute" an HTML file? How are you "distributing" the file? And why?
Is the file dynamic...that is, is the content specific to the receiving user?
Why use HTML at all? I generally use either RTF or PDF to send specific information to individuals (login instructions for an on-line survey, for example).
I'm wondering exactly what you're trying to accomplish. As I say, just being nosy...ultimately it's none of my business.
Quote:
My goal is to produce a single HTML file for end users. It is easier to distribute a single HTML file that can just be opened in the web browser by double clicking on it than having an archive containing a directory tree (at least a subdirectory for images) and multiple files. Unfortunately, even in 2018 there are users who can't handle archives.
Make an EPUB.
An archive containing HTML files and their assets (such as images, stylesheets, etc) is exactly what an EPUB is.
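For the curious, the container layout can be sketched with Python's standard zipfile module; the file names and contents below are illustrative, and a real EPUB additionally needs a valid package document (OPF) and spine:

```python
# Hypothetical minimal EPUB container: a ZIP whose FIRST entry is an
# uncompressed "mimetype" file, plus META-INF/container.xml pointing at
# the package document.
import zipfile

CONTAINER = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

with zipfile.ZipFile("book.epub", "w") as z:
    # Per the container spec, "mimetype" must come first and be stored
    # uncompressed so readers can sniff it at a fixed offset.
    z.writestr("mimetype", "application/epub+zip", zipfile.ZIP_STORED)
    z.writestr("META-INF/container.xml", CONTAINER, zipfile.ZIP_DEFLATED)
    z.writestr("index.html", "<html><body>Hello</body></html>", zipfile.ZIP_DEFLATED)
```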