LinuxQuestions.org


J_Szucs 08-27-2004 06:16 AM

Ghostscript: why is pdf size sometimes increased so much?
 
I had two pdf files each of size 100k, containing scanned images.
Those pdfs were probably created by Adobe Acrobat.

I merged them with ghostscript, as follows:

gs -dNOPAUSE -sDEVICE=pswrite -sOutputFile=1.ps -dBATCH 1.pdf
gs -dNOPAUSE -sDEVICE=pswrite -sOutputFile=2.ps -dBATCH 2.pdf
gs -dNOPAUSE -sDEVICE=pswrite -sOutputFile=merged.ps -dBATCH 1.ps 2.ps
gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=merged.pdf -sProcessColorModel=DeviceGray -sBitsPerSample=1 -dCompressPages=true -dBATCH merged.ps

The problem is that the resulting file (merged.pdf) is about 15 times as big as it should be (3M instead of 200k), and, maybe because of this, acroread renders it very slowly on the screen.

The result is about the same either with or without the ProcessColorModel, BitsPerSample and CompressPages options, which I tried to use to reduce file size.

Strangely, I found that the -r option of gs only increases the file size (-r60 resulted in a file roughly 1.5 times larger).

I wonder what is wrong with those pdf files (or with ghostscript), that they become so large when extracted to ps and re-encoded to pdf?
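
One experiment that might be worth trying is to skip the PostScript round trip altogether: the pdfwrite device can read several PDF inputs in one run and write a single output file. The sketch below assumes the same file names as above; `merge_pdfs` and `DRY_RUN` are names made up for this example and are not part of gs. Whether this avoids the size increase depends on how gs re-encodes the scanned images, so it is a test to run rather than a known fix.

```shell
# Merge PDFs in a single gs pass, without going through pswrite.
# Assumption: merge_pdfs and DRY_RUN are hypothetical names for this sketch.
merge_pdfs() {
    out=$1
    shift
    # Rebuild the argument list as the full gs command line.
    set -- gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite "-sOutputFile=$out" "$@"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"    # only show what would be run
    else
        "$@"         # actually run gs
    fi
}

# Example (prints the command instead of running it):
#   DRY_RUN=1 merge_pdfs merged.pdf 1.pdf 2.pdf
```

Comparing the size of this output with the 3M file from the four-step procedure would show whether the pswrite step is where the growth happens.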

rjlee 08-27-2004 07:56 AM

PostScript is an interpreted language, where everything on the page is divided into a number of different drawing instructions that get processed to build up the page. I believe that PDF is similar (but I may be wrong).

I suspect that what's happening is that the conversion to PS takes each PDF instruction and converts it into a number of different PostScript instructions, and the conversion back to PDF takes each PS instruction and converts it into several PDF instructions.

You might try writing the output file to postscript and using the ps2pdf utility to convert it, which has a number of options to control how the processing works, and what optimisations are done etc.

Hope that's of some help,

— Robert J. Lee

J_Szucs 08-27-2004 09:19 AM

I think that ps2pdf actually calls gs to do the conversion, and any ps2pdf option that affects the resulting pdf file is simply passed through to gs.
So those options are really gs options, which I have already checked carefully.
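
To illustrate that relationship: ps2pdf is a small wrapper script around gs with the pdfwrite device, and options given before the input file are handed on to gs. The sketch below is only an approximation of that mapping, not the exact command line the wrapper builds (the real script adds further flags that differ between Ghostscript versions). `ps2pdf_as_gs` is a hypothetical helper that prints the rough gs equivalent and runs nothing.

```shell
# Print an approximate gs command for: ps2pdf [options] input.ps output.pdf
# Assumption: ps2pdf_as_gs is a made-up helper; the real wrapper adds more flags.
ps2pdf_as_gs() {
    opts=
    # Everything before the last two arguments is treated as an option.
    while [ $# -gt 2 ]; do
        opts="$opts $1"
        shift
    done
    echo "gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile=$2$opts $1"
}

# Example:
#   ps2pdf_as_gs -dCompressPages=true merged.ps merged.pdf
```

Seen this way, running ps2pdf with the same options would be expected to give much the same merged.pdf as the direct gs call.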

I suspect that my problem is instead related to the fact that the pdfs to be merged are pdf version 1.4, which my gs 7.04 can read but cannot write. So maybe it writes pdf version 1.3, which cannot compress the contents as efficiently. (I just realised that I have no such problem with pdf files up to version 1.3.)
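
The version guess can be checked directly, because a PDF file declares its version in the first line of the file (for example `%PDF-1.4`). A minimal sketch, assuming the files start with that standard header; `pdf_version` is a name invented for this example.

```shell
# Print the version declared in a PDF file's header, e.g. "1.4".
# Assumption: pdf_version is a hypothetical helper, and the file begins
# with the usual "%PDF-x.y" header.
pdf_version() {
    head -c 8 "$1" | sed -n 's/^%PDF-\([0-9]\.[0-9]\).*/\1/p'
}

# Example:
#   for f in 1.pdf 2.pdf merged.pdf; do echo "$f: $(pdf_version "$f")"; done
```

If the inputs report 1.4 and merged.pdf reports 1.3, that would fit the explanation above. PDF 1.4 added JBIG2 compression for bilevel images, which may be relevant for scanned pages. Ghostscript documents a -dCompatibilityLevel option for pdfwrite, but it is not clear whether gs 7.04 accepts 1.4 there.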


All times are GMT -5.