I store all documents electronically. Sometimes I can print the invoice as PDF from a website (e.g. amazon). Sometimes I have to scan the image myself and convert it to pdf.
Using ImageMagick's identify, these are the resolutions for the .jpg and .pdf:
iomega-p.jpg JPEG 1653x2337 1653x2337+0+0 8-bit DirectClass 563KB 0.010u 0:00.010
iomega-p.pdf PDF 1653x2337 1653x2337+0+0 16-bit Bilevel DirectClass 484KB 0.140u 0:00.030
So far, so good.
Now I have a different application which I use to view, annotate and store the PDFs. In that application I want to view the PDFs on screen. So I use the external gs command to convert the PDF to PNG so I can display it on screen.
Here is where the trouble starts, as gs converts the image to a much higher resolution:
iomega-p.png PNG 4592x6492 4592x6492+0+0 8-bit DirectClass 6.196MB 1.880u 0:01.869
I don't need this, and it makes processing an order of magnitude slower. That is, I have to wait longer before the image is displayed.
When I change the resolution to 72 dpi, the resulting PNG is exactly the correct resolution.
However, I cannot do that in the application, because the PDF files which I did not scan but printed from the web browser have a much lower resolution:
amazon.pdf PDF 612x792 612x792+0+0 16-bit Bilevel DirectClass 61KB 0.030u 0:00.009
Converting those at 72 dpi produces far too low a quality:
amazon.png PNG 612x792 612x792+0+0 8-bit DirectClass 18.1KB 0.030u 0:00.029
Questions:
- Why is gs converting back at a resolution of 4592x6492? The original was 200 dpi and 1653x2337.
- Why is a printed PDF at a resolution of 612x792 perfectly readable in Okular, while it is apparently only 72 dpi?
- What am I doing wrong here?
jlinkels
Try using -r with two values. According to the man page, -r with a single number sets both x and y to that resolution. Experimenting here shows a huge difference in output size if I use only a single number.
Distribution: Debian /Jessie/Stretch/Sid, Linux Mint DE
Posts: 5,195
Original Poster
Exactly the same result:
jlinkels@donald-pc:/tmp$ identify iomega-p.pdf
iomega-p.pdf PDF 1653x2337 1653x2337+0+0 16-bit Bilevel DirectClass 484KB 0.150u 0:00.040
jlinkels@donald-pc:/tmp$ gs -sPAPERSIZE=a4 -sDEVICE=png16m -r200x200 -o iomega-p.png iomega-p.pdf
GPL Ghostscript 8.71 (2010-02-10)
Copyright (C) 2010 Artifex Software, Inc. All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
jlinkels@donald-pc:/tmp$ identify iomega-p.png
iomega-p.png PNG 4592x6492 4592x6492+0+0 8-bit DirectClass 6.196MB 1.690u 0:01.690
I think your confusion is about resolution in the PDF. A PDF does not have any resolution by itself. It has a page size in points (1/72 of an inch). But inside it, pictures can have any resolution, and it can also contain text and vector graphics. Inside the PDF, bitmap pictures are stored with their pixels mostly as they are (some formats need re-encoding), and one picture can be used many times on a page at different resolutions. The page description basically says "the bitmap picture with id X" should be placed in "specified rectangle".
So when you specify r200 you get an upscaled PNG. Ghostscript makes a bitmap image the size of the page, but at 200 DPI instead of 72 DPI, so it will be a bigger picture (4592x6492). Then it renders the PDF onto that image and saves it as PNG. Because the embedded picture has fewer pixels than that, Ghostscript has to scale it up.
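The arithmetic behind those numbers can be checked with a short sketch. It assumes the raster size is simply the page size in points multiplied by dpi/72 and rounded to whole pixels. The function names and the rounding rule are illustrative assumptions, not Ghostscript's actual code.

```python
POINTS_PER_INCH = 72  # PDF page sizes are expressed in points (1/72 inch)


def page_size_inches(width_pt, height_pt):
    """Physical page size implied by a page size given in points."""
    return (width_pt / POINTS_PER_INCH, height_pt / POINTS_PER_INCH)


def raster_size(width_pt, height_pt, dpi):
    """Pixel size of a page rendered at the given dpi.

    Rounding to the nearest pixel is an assumption made for this sketch.
    """
    scale = dpi / POINTS_PER_INCH
    return (round(width_pt * scale), round(height_pt * scale))


# The scanned PDF reports a page of 1653x2337 points.
print(raster_size(1653, 2337, 200))  # (4592, 6492), the size gs produced
print(raster_size(1653, 2337, 72))   # (1653, 2337), the original scan size
# The printed PDF is 612x792 points, which is US Letter.
print(page_size_inches(612, 792))    # (8.5, 11.0)
```

Seen this way, the scanned PDF describes a page of roughly 23 by 32 inches, which is why 72 dpi already returns every scanned pixel.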
When converting from PDF to PNG, it's usually better to use ImageMagick's convert directly. But "convert a.pdf a.png" can look ugly. To make it clearer, add -density with some number higher than 72. The -density parameter doesn't affect the size of the resulting PNG, but it does affect how the page is rendered in memory. For example, if you use -density 288, it will make a big bitmap in memory, render the PDF onto that, and then scale it down, an effect a bit like antialiasing.
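A rough model of that render-big-then-shrink step is sketched below. It assumes the output is scaled back down to the 72 dpi page size (for instance with an explicit output geometry) and that the intermediate bitmap costs 3 bytes per pixel. Both are assumptions for illustration; ImageMagick's real memory use may well be higher.

```python
def supersample_plan(width_pt, height_pt, factor, bytes_per_pixel=3):
    """Describe rendering at 72*factor dpi and shrinking back by the same factor.

    width_pt, height_pt: page size in points (whole numbers in this sketch).
    bytes_per_pixel: assumed 3 (8-bit RGB); purely an estimate.
    """
    density = 72 * factor
    big_w, big_h = width_pt * factor, height_pt * factor
    return {
        "density": density,
        "intermediate": (big_w, big_h),
        "final": (big_w // factor, big_h // factor),
        "intermediate_bytes": big_w * big_h * bytes_per_pixel,
    }


plan = supersample_plan(612, 792, 4)
print(plan["density"])             # 288
print(plan["intermediate"])        # (2448, 3168)
print(plan["final"])               # (612, 792)
print(plan["intermediate_bytes"])  # 23265792, about 22 MiB
```

Each output pixel is then an average over a 4x4 block of rendered pixels, which is where the smoothing comes from.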
Another way is to use pdfimages from poppler-utils. Then you can extract the bitmap pictures as they are inside the PDF, no quality loss at all.
Original Poster
Ok, let me see if I understand this.
Suppose I scan an image at 200dpi. The resulting image is 1600x2400 pixels. I use convert to convert it to PDF. Now there is a 1600x2400 image in the PDF. This image is A4 size so it fills the complete page in the PDF.
Now I use GS to convert the PDF to a raster image. GS assumes the PDF is 72dpi. So when I request r200, it upscales everything in the PDF by 200/72 ≈ 2.78. Hence the resulting image is (1600x2400) x 2.78 ≈ 4444x6667.
OTOH, when I request r72, GS doesn't upscale and gives me back whatever happened to be in the PDF, which is 1600x2400. That the image itself happens to be 200dpi on an 8" wide page is opaque to GS.
That is exactly what happens.
Now in case the PDF was created by printing an arbitrary document containing mostly text.
Again GS assumes this is 72dpi. So when I request a conversion from pdf with r72, GS produces a page which is actually 8" and characters are rasterized at 72dpi. Which looks ugly.
In order to have an acceptably rendered picture I should use r200 to tell GS to rasterize the characters at 200dpi.
Is that all correct?
I did do the conversion to bitmap at first with convert using -density, but then I modified it to use GS. I am not sure why, maybe it was because it was said that GS is the recommended way for the conversion, maybe because I found that GS was faster. The results in terms of functionality and resolution are identical.
I read about poppler as well. However my program is in TCL and no API is available for TCL. I didn't really want to start a project on PDF conversion. The application is some kind of document management system. I just want to view what is inside the PDF with reasonable quality.
This will scale it to 1600x2400 no matter what the source is. (Actually it will be 1600 pixels wide or 2400 pixels high, depending on the aspect ratio.)
The density will not affect the resulting size at all. ImageMagick will make a huge picture in memory, Ghostscript will render the PDF on it, and then it is scaled down. Maybe it's not necessary to use such a high value. It will use a lot of memory and be slow. The effect is the same as if you use r200 with Ghostscript and scale it down afterwards.
Another thing: it can be better to use a multiple of 72 for density, for example 288. The downscaling is easier and can give a better result when a whole block of rendered pixels corresponds to exactly 1 pixel in the result (at 288, a 4x4 block).
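The fit described above (1600 wide or 2400 high, whichever limit is hit first) can be worked out per document. The sketch below is only an illustration: `fit_in_box` is a hypothetical helper, rounding to the nearest pixel is an assumption, and the page sizes are the ones identify reported earlier in the thread.

```python
def fit_in_box(width_pt, height_pt, max_w, max_h):
    """Scale a page (size in points) to fit inside max_w x max_h pixels.

    Returns the fitted pixel size and the equivalent rendering dpi,
    keeping the aspect ratio of the page.
    """
    scale = min(max_w / width_pt, max_h / height_pt)
    size = (round(width_pt * scale), round(height_pt * scale))
    dpi = 72 * scale  # a page in points is 72 dpi at scale 1
    return size, dpi


# Printed PDF: 612x792 points (US Letter)
print(fit_in_box(612, 792, 1600, 2400))
# Scanned PDF: 1653x2337 points
print(fit_in_box(1653, 2337, 1600, 2400))
```

For the printed page this gives 1600x2071 pixels at about 188 dpi, and for the scanned page 1600x2262 pixels at just under 70 dpi. A per-document value like this could serve as the rendering resolution, so scans and printed pages end up at a similar pixel size; that use is a suggestion and was not tested in this thread.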
Original Poster
Quote:
Originally Posted by Guttorm
You want the PNG to have a size corresponding to a screen size no? 1600x2400 is a bit strange. Anyway, this should make a decent result:
It is what I get when I scan an 8" page with 200 dpi. It doesn't have anything to do with screen size. I think 200dpi is a reasonable quality. 150 dpi is too low for my taste, 300dpi doesn't bring a significant improvement in picture quality. For my purpose that is.
Quote:
Originally Posted by Guttorm
The density will not affect the resulting size at all. ImageMagick will make a huge picture in memory, Ghostscript will render the PDF on it, and then it is scaled down. Maybe it's not necessary to use such a high value. It will use a lot of memory and be slow. The effect is the same as if you use r200 with Ghostscript and scale it down afterwards.
What you say here is not very clear. Rendering the PDF with density=72 or density=200 makes a huge difference in file size.
Now this doesn't say a thing. Because there is no information added, the 72dpi png and the 200dpi png compress to roughly the same size. (Compressing is removing redundancy. The 200 dpi file contains a lot of redundancy because the information is the same as in the 72 dpi file)
-rw-r--r-- 1 jlinkels jlinkels 8.1M May 9 00:49 sgb2-200.png
-rw-r--r-- 1 jlinkels jlinkels 5.9M May 9 00:48 sgb2-72.png
The reason you get a different file size and picture size is that you don't specify -geometry. If you specify -geometry, the picture size will be the same and the file size will differ only slightly. If you compare the two files, the 200 file should be sharper and the text more readable.