How to check PDF pages for resolution (DPI) of embedded images?
Is there a free library that can report the resolution (in DPI) of the images contained in a PDF file?
I've tried the following code using PDFSharp, but the DPI it returns is not correct. For example, it shows 96 dpi when it should be 150 dpi:
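using PdfSharp.Drawing;
using PdfSharp.Pdf;
using PdfSharp.Pdf.IO;

using (PdfDocument pdf = PdfReader.Open(sourcePdf))
{
    for (int i = 0; i < pdf.Pages.Count; i++)
    {
        // Reads DpiX from the graphics context created for this page
        XGraphics xGraphics = XGraphics.FromPdfPage(pdf.Pages[i]);
        float dpi = xGraphics.Graphics.DpiX;
    }
}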
You can use a command line tool to get the info you need: pdfimages.
However, you need a recent version of pdfimages that is based on the Poppler library (NOT the 'pdfimages' that is based on XPDF!).
Recent Poppler versions let you use the -list option:
pdfimages -list -f 2 -l 4 my.pdf
The output of the above example command shows all images in the page range from 2 (-f, the first page to show) to 4 (-l, the last page to show).
Here is the output for the above command, using an example PDF file I prepared specifically for this question:
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   2     0 image     697  1238  gray    1   8  jpeg   no        16  0   320   320  142K  17%
   3     1 image     697  1238  gray    1   8  jpeg   no        16  0   151   151  142K  17%
   4     2 image     697  1238  gray    1   8  jpeg   no        16  0    84   115  142K  17%
The output shows the following:
There are three images on the three pages 2-4 (as indicated by columns 1+2, headed page and num).
The PDF object IDs for all three images are identical: 16 0 (as indicated by columns 11+12, headed object + ID). This means the PDF defines only one distinct image object but shows it three times (i.e., the image is embedded only once, but appears on 3 pages).
The image's width is 697 pixels, its height is 1238 pixels, its colorspace is gray, its number of color channels/components is 1, its image depth (bits per component) is 8, its compression scheme is jpeg, its byte size (as embedded) is 142K, and its compression ratio is 17% (as indicated by columns 4-9 and 15+16, headed width, height, color, comp, bpc, enc, size and ratio).
However, the same image appears on different pages in different resolutions (given as PPI, pixels per inch, not DPI):
page 2 shows it with a PPI of 320 in both directions,
page 3 shows it with a PPI of 151 in both directions,
while page 4 shows it with a PPI of 84 in the horizontal (X) direction and 115 PPI in the vertical (Y) direction.
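The reason one embedded image can report three different resolutions is that its pixel dimensions are fixed, while each page scales it to a different printed size. A minimal sketch of that arithmetic follows, assuming the default PDF user space unit of 1/72 inch. The function names are made up for illustration, and the placed sizes are derived back from the PPI values in the listing rather than measured from my.pdf.

```python
POINTS_PER_INCH = 72.0  # assumption: default PDF user space, no UserUnit override


def effective_ppi(pixel_w, pixel_h, placed_w_pt, placed_h_pt):
    """Pixels per inch of an image drawn at the given size (in points)."""
    x_ppi = pixel_w / (placed_w_pt / POINTS_PER_INCH)
    y_ppi = pixel_h / (placed_h_pt / POINTS_PER_INCH)
    return x_ppi, y_ppi


def placed_size_pt(pixel_w, pixel_h, x_ppi, y_ppi):
    """Inverse direction: size in points implied by a reported PPI pair."""
    return (pixel_w / x_ppi * POINTS_PER_INCH,
            pixel_h / y_ppi * POINTS_PER_INCH)


if __name__ == "__main__":
    # The 697 x 1238 image at 320 PPI occupies about 156.8 x 278.6 pt.
    print(placed_size_pt(697, 1238, 320, 320))
    # At 84 x 115 PPI it is stretched to roughly 597 x 775 pt, so the
    # aspect ratio on that page differs from the image's own.
    print(placed_size_pt(697, 1238, 84, 115))
    # Going the other way recovers the PPI from the placed size.
    print(effective_ppi(697, 1238, 156.825, 278.55))
```

Because pdfimages prints rounded PPI values, sizes derived this way are approximate.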
Now, if the command line tool cannot be repurposed for your goal: the Poppler library, on which the tool shown above is based, is certainly Free ('free as in liberty' as well as 'free as in beer').
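If calling the command line tool from a program is acceptable, its listing can be parsed directly. The following is a rough sketch, not part of Poppler: the function names and the 150 PPI threshold are illustrative assumptions, and it expects a Poppler-based pdfimages on the PATH that prints the column layout shown above. Rows with a different layout are simply skipped.

```python
import subprocess

# Sample listing copied from the output shown earlier in this answer.
SAMPLE = """\
page   num  type   width height color comp bpc  enc interp  object ID x-ppi y-ppi size ratio
--------------------------------------------------------------------------------------------
   2     0 image     697  1238  gray    1   8  jpeg   no        16  0   320   320  142K  17%
   3     1 image     697  1238  gray    1   8  jpeg   no        16  0   151   151  142K  17%
   4     2 image     697  1238  gray    1   8  jpeg   no        16  0    84   115  142K  17%
"""


def parse_pdfimages_list(text):
    """Turn the table printed by `pdfimages -list` into a list of dicts."""
    lines = text.splitlines()
    if not lines:
        return []
    header = lines[0].split()  # column names from the first line
    rows = []
    for line in lines[1:]:
        fields = line.split()
        # Skip the dashed separator and any row that does not match the header.
        if len(fields) != len(header) or not fields[0].isdigit():
            continue
        rows.append(dict(zip(header, fields)))
    return rows


def resolutions(rows):
    """Return (page, x_ppi, y_ppi) as integers for each parsed row."""
    return [(int(r["page"]), int(r["x-ppi"]), int(r["y-ppi"])) for r in rows]


def below_threshold(rows, min_ppi):
    """Pages whose image falls under min_ppi in either direction."""
    return [page for page, x, y in resolutions(rows) if min(x, y) < min_ppi]


def list_images(pdf_path, first_page, last_page):
    """Run pdfimages on a real file (requires Poppler's pdfimages on PATH)."""
    cmd = ["pdfimages", "-list", "-f", str(first_page), "-l", str(last_page), pdf_path]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return parse_pdfimages_list(result.stdout)


if __name__ == "__main__":
    rows = parse_pdfimages_list(SAMPLE)
    print(resolutions(rows))           # [(2, 320, 320), (3, 151, 151), (4, 84, 115)]
    print(below_threshold(rows, 150))  # [4]
```

With a real file, list_images("my.pdf", 2, 4) would replace the sample text; the rest of the code stays the same.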
Here is a link to the PDF ("my.pdf") I used to demonstrate the output of the command above.