
PDF dimensions vs actual content dimension

I'm currently using PHP's Imagick to convert some PDFs to images. This works well except for one small detail: the images are 'chopped' during output.

This seems to be due to a difference between the dimensions reported by the PDF and the actual content dimensions.

The PDF reports itself as a 612x792, 72 ppi document, yet when I export an image from it via Preview on the Mac, the image is 1651x1275. How is this possible?

The export is evidently correct, since the image displays properly at those dimensions. Could it be that the PDF was simply encoded incorrectly, with the width and height swapped? How can I detect this via code? Also, the exported image is much larger, roughly twice the size, which leads me to believe Imagick isn't reading some information properly.

Basically I'd like to know if there is a proper way to determine the actual PDF content size, so that the images exported from it are at the best quality possible.

Thanks!

EDIT: (code added)

<?php
$im = new Imagick();
$im->readImage("SomeTest.pdf");
$im->setImageColorspace(255);
$im->setCompression(Imagick::COMPRESSION_JPEG);
$im->setCompressionQuality(60);
$im->setImageFormat('jpeg');
$im->writeImages("SampleImage.jpg");
?>

The PDF used is the following: http://www.pantone.com/pages/MYP_mypantone/software_downloader.aspx?f=3

Also, here is the output of imagick from the identifyImage() function, which seems a bit wrong looking at the file size.

Array
(
    [imageName] => /tmp/magick-XXehkI8e
    [format] => PDF (Portable Document Format)
    [geometry] => Array
        (
            [width] => 612
            [height] => 792
        )

    [type] => TrueColor
    [colorSpace] => RGB
    [resolution] => Array
        (
            [x] => 72
            [y] => 72
        )

    [units] => Undefined
    [fileSize] => 50mb
    [compression] => Undefined
    [signature] => 9426f3fc4f45afd71941435a37d585d01e01d32458f3ca241e72892c2f7f35d5
)

You should be aware that PDF on its own is a resolution-free format. Pages are described mathematically, in a way that isn't tied to any particular resolution limit except those imposed by floating-point numbers.

PDF only truly has resolution when it is rendered to a particular device (and that may or may not be at the device's resolution).

"But what about images? Images in PDFs surely give it resolution!" Sort of. Images in PDF are represented as unit-free samples and do not themselves have resolution until they have been instantiated on a page. I can take a 300 dpi 8.5"x11" 1-bit image and embed it into a PDF. That same image can be placed in the content stream of a page so that it fills the entire 8.5"x11" area, thus maintaining its resolution, or it can be placed as a much smaller thumbnail (creating a higher resolution through the scale). Even those "resolutions" don't apply until the page is actually rendered to a device. In addition, PDF renderers are not prevented from doing bilinear (or some other) interpolation to increase the apparent resolution of an image.

To give you a much more concrete example, if I render a PDF page on a 96 dpi monitor at 100%, the resolution of that page is no greater than 96 dpi. If I render that PDF page on an 1800 dpi phototypesetter, the resolution of the page is no greater than 1800 dpi.
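To make the device dependence concrete, here is a minimal sketch of the arithmetic, assuming only that one PDF unit is 1/72 of an inch. The helper name `pointsToPixels` is made up for illustration and is not part of any library.

```php
<?php
// Hypothetical helper: convert a length in PDF points (1/72 inch)
// to device pixels at a given resolution in dots per inch.
function pointsToPixels(float $points, float $dpi): int
{
    return (int) round($points / 72.0 * $dpi);
}

// The page from the question is 612 x 792 points (8.5" x 11").
foreach ([72, 96, 150, 300] as $dpi) {
    printf(
        "%4d dpi: %d x %d px\n",
        $dpi,
        pointsToPixels(612, $dpi),
        pointsToPixels(792, $dpi)
    );
}
```

The same page comes out as 612 x 792 pixels at 72 dpi and 816 x 1056 pixels at 96 dpi; the page itself has no pixel size until a resolution is chosen.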

If I render a 300 dpi image at 100% on a PDF page rendered at 100% on a 96 dpi monitor, the resolution of the image on the page is 96 dpi. If I render a 300 dpi image at 100% on a PDF page rendered at 100% on an 1800 dpi phototypesetter, the resolution of the image on the page is 300 dpi.
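The two cases above can be sketched as a small calculation. This is a simplified model that ignores interpolation by the renderer; both function names are hypothetical, and the 2550-pixel image width is an assumed example (8.5" at 300 dpi).

```php
<?php
// Hypothetical helper: resolution of an embedded image as placed on the page,
// i.e. image samples per inch of placed width (placed width given in points).
function placedImageDpi(int $pixelWidth, float $placedWidthPoints): float
{
    return $pixelWidth / ($placedWidthPoints / 72.0);
}

// Hypothetical helper: resolution visible after rendering. The device cannot
// show more detail than it has, and (without interpolation) the image cannot
// show more detail than it contains.
function renderedImageDpi(float $placedDpi, float $deviceDpi): float
{
    return min($placedDpi, $deviceDpi);
}

$fullPage  = placedImageDpi(2550, 612.0); // fills the 8.5" page width
$thumbnail = placedImageDpi(2550, 153.0); // scaled to a quarter of that width

printf("placed full page: %.0f dpi\n", $fullPage);
printf("placed thumbnail: %.0f dpi\n", $thumbnail);
printf("full page on 96 dpi monitor: %.0f dpi\n", renderedImageDpi($fullPage, 96.0));
printf("full page on 1800 dpi typesetter: %.0f dpi\n", renderedImageDpi($fullPage, 1800.0));
```

Placing the same samples in a smaller area raises the placed resolution, but what ends up in the output is still capped by the device.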

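The output you are seeing from ImageMagick probably reflects that an 8.5" x 11" page in PDF units is 612 x 792, where 1 PDF unit is equivalent to 1/72 of an inch. The Preview rendering appears to have been done at about 150 dpi: 1275 / 8.5 = 150 and 1651 / 11 ≈ 150. Note that this only matches with width and height swapped relative to 612 x 792, which would be consistent with the page being displayed rotated to landscape.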
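As a rough check on the numbers in the question, the rendering resolution implied by an export can be computed per axis. The sketch below uses the 612 x 792 page and the 1651 x 1275 export from the question; `impliedDpi` is a hypothetical helper name.

```php
<?php
// Hypothetical helper: rendering resolution implied by mapping a page
// dimension in points (1/72 inch) onto an output dimension in pixels.
function impliedDpi(float $pagePoints, int $pixels): float
{
    return $pixels / ($pagePoints / 72.0);
}

// Same orientation: page width -> export width, page height -> export height.
printf("as-is:   %.1f x %.1f dpi\n", impliedDpi(612, 1651), impliedDpi(792, 1275));

// Swapped orientation: page height -> export width, page width -> export height.
printf("swapped: %.1f x %.1f dpi\n", impliedDpi(792, 1651), impliedDpi(612, 1275));
```

Matching the axes as-is gives about 194 dpi horizontally but only about 116 dpi vertically, which cannot both be right. With the axes swapped, both come out at about 150 dpi, so the export looks like a 150 dpi rendering of the page in landscape orientation. Whether the file actually carries a page rotation is an assumption that would need to be confirmed by inspecting the PDF.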

The image within the PDF was scaled down to some size within the PDF (or it would be cropped when you look at it in Reader et al).
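Applied to the code in the question, this suggests asking for a higher rendering density before the PDF is read, since the default is 72 dpi. The following is a minimal sketch, assuming the Imagick extension with a working PDF delegate; the function name `rasterizePdf`, its defaults, and the file names in the usage comment are placeholders, and 150 dpi is chosen only to match the Preview export.

```php
<?php
// Hypothetical helper: rasterize every page of a PDF to JPEG at a chosen density.
function rasterizePdf(string $pdfPath, string $outPath, float $dpi = 150.0, int $quality = 60): void
{
    $im = new Imagick();

    // The density must be set BEFORE readImage(); it controls the resolution
    // at which the PDF pages are rasterized (72 dpi if left unset).
    $im->setResolution($dpi, $dpi);
    $im->readImage($pdfPath);

    // Apply the output settings to each page of the document.
    foreach ($im as $page) {
        $page->setImageFormat('jpeg');
        $page->setImageCompressionQuality($quality);
    }

    // With adjoin = false, each page is written to its own file.
    $im->writeImages($outPath, false);
    $im->clear();
}

// Usage (placeholder file names):
// rasterizePdf('SomeTest.pdf', 'SampleImage.jpg', 150.0);
```

At 150 dpi a 612 x 792 point page becomes 1275 x 1650 pixels. This addresses output size and quality only; if the pages are still cropped or rotated incorrectly, the page boxes and rotation in the PDF would need to be looked at separately.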

ImageMagick (which I ass-u-me Imagick uses) uses Ghostscript to convert PDFs to images. Ghostscript is Quite Good at rendering PDF files. I have to wonder if you're passing some bad info along.

Can we see some code? Links to your input PDF and output image[s] would be nice too.


I just ran gs 8.71 on your PDF, and it rendered fine. What version of Ghostscript are you using?
