Is it possible to import a raster of a PDF file?

Our office scans data entry forms, and we lack any proprietary software that can do automated double entry (primary entry is done by hand, of course). We are hoping to provide a tool that lets researchers highlight regions on forms and use the scanned versions to determine what the participant entered.

To do this, all I need for a very rough attempt is a way to read in PDFs as raster files, with coordinates as X and Y components and black-and-white "intensities" as a Z-axis.

We use R mainly for statistical analysis and data management, so options in R would be great.

You could use the raster package in R. It doesn't support .pdf files, but it does support .tif, .jpg, and .png (among many others). Converting your PDFs into PNGs shouldn't be a big problem: look here for more information.
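One possible way to do the PDF-to-PNG conversion without leaving R is sketched below. It assumes the pdftools package is installed, which the original answer does not mention; the file names are hypothetical, and the small PDF is generated on the spot only so that the example is self-contained.

```r
# Sketch: render a PDF page to a PNG file from within R.
# Assumption: the 'pdftools' package is installed (not part of the original answer).
library(pdftools)

# Build a small one-page PDF so the example runs on its own.
# In practice pdf_path would point at a scanned form.
pdf_path <- file.path(tempdir(), "example_form.pdf")
pdf(pdf_path, width = 4, height = 4)
plot(1:10, main = "Example form")
dev.off()

# Hypothetical output path for the rendered page.
png_path <- file.path(tempdir(), "example_form_page1.png")

# Render page 1 at 150 dpi and write it to png_path.
pdf_convert(
  pdf_path,
  format = "png",
  pages = 1,
  filenames = png_path,
  dpi = 150,
  verbose = FALSE
)
```

A higher dpi gives more pixels per form field at the cost of larger files; the value of 150 here is only a placeholder.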

Once you have your PNG files ready, you can do the following:

library(raster)  # provides raster() and extract()
png <- raster("your/png/file.png")  # reads the first band of the image

and then use the extract() function to get the brightness value from the picture. For example, say your PNG is 200x200 px and you want to extract the pixel value at row 100 and column 150:

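# extract() treats a plain numeric vector as cell numbers, so convert row/column to a cell number first
value <- extract(png, cellFromRowCol(png, 100, 150))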
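To get at the X, Y, Z layout asked about in the question, and to read a whole highlighted region rather than a single pixel, something like the following might work. It is only a sketch: the image is a synthetic 200x200 matrix standing in for a scanned page, the extent is set by hand to pixel units, and the region bounds are made up for illustration.

```r
# Sketch: pixel intensities as an x/y/z table, plus a rectangular region.
library(raster)

# Stand-in for a scanned page: 200 x 200 grey levels (0 = black, 255 = white).
# In practice this would come from raster("your/png/file.png").
m <- matrix(255, nrow = 200, ncol = 200)
m[90:110, 140:160] <- 0  # a dark mark, e.g. a filled-in checkbox
img <- raster(m, xmn = 0, xmx = 200, ymn = 0, ymx = 200)

# Long table: one row per pixel, x and y are cell centres, z is the intensity.
xyz <- as.data.frame(rasterToPoints(img))
names(xyz) <- c("x", "y", "z")

# All intensities inside a rectangular region, addressed by row and column
# (hypothetical region: rows 90-110, columns 140-160).
region <- getValuesBlock(img, row = 90, nrows = 21, col = 140, ncols = 21)
mean_region <- mean(region)

# A single pixel at row 100, column 150.
pixel <- img[100, 150]
```

Comparing the mean intensity of a region against a threshold is one simple way to decide whether a field was marked; the threshold would need to be tuned on real scans.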
