PHP: detect the DPI of an image in a PDF

I've tried a few tests using Imagick::getImageResolution on a PDF, and I can't figure out how to get the resolution (and colourspace) of an image embedded in a PDF. I've tried ripping the image out of the PDF, but during that process the DPI seems to be arbitrarily set to 72 no matter what I do.

I saw in 1564529 someone said DPI doesn't matter to a PDF, but that is not true: when an image is embedded in a PDF, several attributes of the image, such as resolution, are defined in the PostScript. Is there a way in PHP (possibly with PSLib?) to figure out what the DPI of an embedded image is?

The 'dpi' of an image in PDF (or PostScript) is more nebulous than you may think. This is because it is possible to render the PDF at different scales, and so the actual dpi will vary.

You are correct that there is information regarding the scale factor of the image embedded in the document. This is the Current Transformation Matrix (CTM), but it is not as simple as a single value, or even a single matrix.

The CTM maps co-ordinates into an idealised 'user space' which is nominally defined in points (72 per inch), but is infinitely subdivisible. When it comes to rendering, the 'user space' has a further transformation applied to scale it properly to the 'device space'. That second transformation is required because the device probably isn't 72 dpi.

You can find a much fuller explanation of this in the PDF Reference Manual, especially section 4.2.1 in the 1.7 reference.

So it would seem that all you need to do is take the declared /Width and /Height from the image dictionary, and apply the /Matrix to determine how big the image is in user space. Given that user space is effectively 72 dpi, then you would know how many inches the image was scaled to, how many pixels the image contains, and a simple division would give you the answer you want.
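The division described above can be sketched as follows. This is a minimal illustration, not a PDF parser: it assumes you have already obtained the image's pixel dimensions and the six-number matrix [a b c d e f] that maps the image's unit square into user space, and that user space is the default 72 units per inch. The function name image_dpi is made up for this example, and Python is used purely for the arithmetic.

```python
import math

POINTS_PER_INCH = 72.0  # default user space: 72 units per inch


def image_dpi(width_px, height_px, matrix):
    """Return (horizontal_dpi, vertical_dpi) for an image drawn with `matrix`.

    `matrix` is (a, b, c, d, e, f). The image's unit square edge (1, 0)
    maps to the vector (a, b) and the edge (0, 1) maps to (c, d), so the
    lengths of those vectors are the drawn width and height in points.
    """
    a, b, c, d, _e, _f = matrix
    width_pt = math.hypot(a, b)
    height_pt = math.hypot(c, d)
    return (width_px / (width_pt / POINTS_PER_INCH),
            height_px / (height_pt / POINTS_PER_INCH))


# A 300 x 150 pixel image drawn 2 inches wide and 1 inch high.
print(image_dpi(300, 150, (144, 0, 0, 72, 0, 0)))  # (150.0, 150.0)
```

Using the vector lengths rather than just a and d keeps the result correct when the image is rotated, since a 90 degree rotation puts the scale factors into b and c instead.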

Indeed, in many cases this will work. However, one of the problems from your point of view is that it is possible, indeed common, to concatenate matrices to affect the current scaling. Simply looking at the matrix applied to an image therefore won't give you the scale factor applied to that image, because something else may have already scaled the CTM. In addition, PDF contains the 'UserUnit' kludge, which allows a file to alter the default scaling of user space.
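To see why the image's own matrix is not enough, here is a small sketch of matrix concatenation in the six-number [a b c d e f] form. The helper names concat and dpi are invented for the example, and the numbers are made up: an outer matrix halves everything, then the image is drawn with a 144-point matrix. The user_unit parameter stands in for the page's UserUnit value, taken here to mean the size of one user space unit in multiples of 1/72 inch.

```python
import math


def concat(m, ctm):
    """Return m x ctm: the new matrix m is applied first, then the old CTM."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2)


def dpi(width_px, height_px, ctm, user_unit=1.0):
    """Effective (x, y) dpi given the total CTM at the moment of drawing."""
    a, b, c, d, _e, _f = ctm
    width_in = math.hypot(a, b) * user_unit / 72.0
    height_in = math.hypot(c, d) * user_unit / 72.0
    return (width_px / width_in, height_px / height_in)


outer = (0.5, 0, 0, 0.5, 0, 0)         # something earlier halved the scale
image_matrix = (144, 0, 0, 144, 0, 0)  # matrix set just before the image
total = concat(image_matrix, outer)    # (72.0, 0.0, 0.0, 72.0, 0.0, 0.0)

print(dpi(100, 100, image_matrix))        # naive answer: 50 dpi
print(dpi(100, 100, total))               # with the outer scale: 100 dpi
print(dpi(100, 100, total, user_unit=2))  # UserUnit 2 doubles the size: 50 dpi
```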

So the only way to work out the 'dpi' of an image is to interpret the page description to the point where the image is rendered, work out the total scaling at that point and from there figure out how much area the image covers. Then given the width and height of the image, work out its dpi.
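A heavily simplified sketch of that interpretation step is shown below. It assumes the page's content stream has already been decompressed into text, that tokens are separated by whitespace, and that only the q, Q, cm and Do operators matter. Real content streams also contain strings, inline images and form XObjects, which this ignores. The function name image_placements and the image_sizes lookup (image name to pixel width and height) are hypothetical.

```python
import math

IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def concat(m, ctm):
    """Return m x ctm: the new matrix m is applied first, then the old CTM."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = ctm
    return (a1 * a2 + b1 * c2, a1 * b2 + b1 * d2,
            c1 * a2 + d1 * c2, c1 * b2 + d1 * d2,
            e1 * a2 + f1 * c2 + e2, e1 * b2 + f1 * d2 + f2)


def image_placements(content, image_sizes, user_unit=1.0):
    """Walk a content stream and return [(name, x_dpi, y_dpi), ...]."""
    ctm = IDENTITY
    saved = []      # CTMs pushed by q and restored by Q
    operands = []   # tokens seen since the last handled operator
    results = []
    for token in content.split():
        if token == "q":
            saved.append(ctm)
        elif token == "Q":
            ctm = saved.pop()
        elif token == "cm":
            ctm = concat(tuple(float(v) for v in operands[-6:]), ctm)
        elif token == "Do":
            name = operands[-1].lstrip("/")
            if name in image_sizes:
                width_px, height_px = image_sizes[name]
                a, b, c, d, _e, _f = ctm
                width_in = math.hypot(a, b) * user_unit / 72.0
                height_in = math.hypot(c, d) * user_unit / 72.0
                results.append((name, width_px / width_in, height_px / height_in))
        else:
            operands.append(token)  # a number, a name, or an ignored operator
            continue
        operands.clear()
    return results


stream = "q 2 0 0 2 0 0 cm q 72 0 0 48 10 10 cm /Im1 Do Q Q"
for name, x_dpi, y_dpi in image_placements(stream, {"Im1": (300, 200)}):
    print(name, round(x_dpi), round(y_dpi))  # Im1 150 150
```

The saved stack matters because q and Q bracket temporary changes to the CTM. Without it, a scale applied for one image would leak into everything drawn afterwards.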

In passing, here's a conundrum for you: it's entirely possible to draw the same image multiple times in PDF, using the same image data. You only have to include the image data once. If I draw an image which is 100 pixels by 100 pixels and I draw it to cover one square inch, the resolution is 100 dpi. Now I draw the same image, but I scale it to cover a half-inch square. The resolution of the rendered image is now 200 dpi.

So what is the 'dpi of the image'?
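The conundrum in numbers, as a small sketch. Reporting one dpi per placement, and treating the lowest as the worst case, is only one possible convention and an assumption of this example. Nothing in the PDF format singles out one value.

```python
POINTS_PER_INCH = 72.0


def dpi_for_placement(pixels, drawn_points):
    """dpi along one axis when `pixels` samples span `drawn_points` points."""
    return pixels / (drawn_points / POINTS_PER_INCH)


pixels = 100
placements_pt = [72.0, 36.0]  # same image drawn 1 inch wide, then 0.5 inch wide

dpis = [dpi_for_placement(pixels, pt) for pt in placements_pt]
print(dpis)       # [100.0, 200.0]
print(min(dpis))  # 100.0, the lowest effective resolution on the page
```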

