LinuxQuestions.org


Michael Uplawski 12-08-2018 02:50 AM

IGNORE, I'm dumb: Better zip than zip -9
 
You can read on, but first my solution: remove every newline except the very first one, remove all whitespace between XML tags, and the file will be smaller... imagine.

Good morning.

This is not a time-travel thread and won't take you back to the 1990s, meaning: I know how to compress better than zip does. I do not need recommendations for file compressors other than those which create zip files.

But today I generated an OOXML word-processor document and removed everything that has no function in it: unused XML code from the styles, anything that I do not need in document.xml, and all the unnecessary files and sub-folders. In the end I have the folder “word” with two XML files, styles.xml and document.xml.

Using
Code:

zip -9r new_doc word
my result is a functional document file, but it is twice the size of the file that the word processor creates, even though that one still includes all the stuff that I had deleted from mine!

So, somewhere in the process of storing a document from the word-processor, zip-compression is used, but the algorithm appears to be much better than (my) zip -9.

Do you know a compressor which creates zip files and compresses better than zip?

TIA

Michael

P.S. Test case: http://www.uplawski.eu/Test-files/test_files.zip. My file is new_test.tmdx; the word processor creates test.tmdx. You can unzip them to see what I mean.
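
For anyone who wants to compare two such archives member by member instead of unzipping them, here is a minimal sketch using Python's standard zipfile module. The helper names (member_sizes, build_archive) and the sample content are made up for illustration; they are not part of the test case above.

```python
import io
import zipfile


def member_sizes(zip_bytes: bytes) -> dict:
    """Map each member name to (uncompressed size, compressed size)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            info.filename: (info.file_size, info.compress_size)
            for info in archive.infolist()
        }


def build_archive(members: dict) -> bytes:
    """Build an in-memory zip with Deflate at level 9, roughly what zip -9 requests."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(
        buffer, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9
    ) as archive:
        for name, text in members.items():
            archive.writestr(name, text)
    return buffer.getvalue()


# Hypothetical stand-in for a stripped-down document folder.
sample = build_archive(
    {
        "word/document.xml": "<w:document>" + "<w:p/>" * 500 + "</w:document>",
        "word/styles.xml": "<w:styles/>",
    }
)

for name, (raw, packed) in member_sizes(sample).items():
    print(f"{name}: {raw} bytes raw, {packed} bytes compressed")
```

To inspect a real file, something like member_sizes(open("test.tmdx", "rb").read()) should work the same way, assuming the file is an ordinary zip container.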

ondoho 12-09-2018 06:40 AM

maybe the command syntax is wrong?
maybe "zip -9r" won't work, but "zip -9 -r" will?

Michael Uplawski 12-09-2018 03:40 PM

Quote:

Originally Posted by ondoho (Post 5935077)
maybe the command syntax is wrong?
maybe "zip -9r" won't work, but "zip -9 -r" will?

No, that is all okay. As stated in the very first sentence above, it was sufficient to just remove the useless whitespace. This way I reduced a (very small) document from 16k to 1.3k when zipped.

It is an obvious thing to do, but I had forgotten. Word processors which save as OOXML do this routinely, or never write whitespace where it is unnecessary in the first place. Only when I manipulate the XML by hand do I first apply proper indentation to make it readable. That is enough to bloat the document.
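
A rough sketch of that clean-up step in Python, for illustration only. The function names and the sample fragment are hypothetical, and the regular expression is deliberately naive: it also deletes whitespace-only text runs such as `<w:t xml:space="preserve"> </w:t>` as well as the newline after an XML declaration, so a real document may need a more careful approach. zlib at level 9 only approximates what zip -9 does; the exact byte counts will differ.

```python
import re
import zlib


def strip_inter_tag_whitespace(xml_text: str) -> str:
    """Remove whitespace-only runs between a closing '>' and the next '<'."""
    return re.sub(r">\s+<", "><", xml_text)


def deflated_size(text: str) -> int:
    """Size in bytes after zlib compression at level 9 (Deflate plus a small wrapper)."""
    return len(zlib.compress(text.encode("utf-8"), 9))


# Hypothetical indented fragment, similar to a pretty-printed document.xml.
pretty = (
    "<w:body>\n"
    + "".join(
        "    <w:p>\n"
        "        <w:r>\n"
        f"            <w:t>Paragraph {i}</w:t>\n"
        "        </w:r>\n"
        "    </w:p>\n"
        for i in range(200)
    )
    + "</w:body>\n"
)
compact = strip_inter_tag_whitespace(pretty)

print("indented:", len(pretty), "bytes raw,", deflated_size(pretty), "compressed")
print("compact: ", len(compact), "bytes raw,", deflated_size(compact), "compressed")
```

How much the compressed size changes depends on the content; the raw size always shrinks when there was indentation to remove.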

I am not sure whether it makes sense to keep this thread around in case anybody stumbles over the same phenomenon, i.e. makes the same dumb mistake as I did. Otherwise, “ignore, I am dumb” will suffice... ;)

