
Using R, how can someone count the number of pages in a PDF file?

I have about a hundred long PDF files in a directory and would like to know whether R can count how many pages are in each file. My operating system is Windows 8.

Here is a link to a 10-page PDF file, in case it helps you test your solution: MWE pdf file

It appears to be possible to count PDF pages with Python (python solution), but I don't know that language. Other solutions have been discussed on SO using, e.g., ImageMagick and C#.

I'm working on a Windows 7 machine, but my experience with Windows 8 makes me think this should work just as well for you.

I wasn't able to compile the Rpoppler package, and as hrbrmstr points out, it's probably not worth the fight. If you have 7-Zip, you can extract the poppler tools for Windows. I extracted them to C:\poppler. Once they are there, I can do the following:

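library(magrittr)  # provides the %>% pipe used below

file_name <- "C:/[file_path]/whitepaper-pdfprimer.pdf"

# Return the page count that pdfinfo reports for a single PDF file
pdf_pages <- function(file_name){
  # shQuote() keeps paths that contain spaces intact on the command line
  info <- system2("C:/poppler/bin/pdfinfo.exe",
                  args = shQuote(file_name),
                  stdout = TRUE)
  # Keep only the "Pages:" line, strip the label, and convert to a number
  info[grepl("^Pages:", info)] %>%
    sub("^Pages:\\s*", "", .) %>%
    as.numeric()
}

pdf_pages(file_name)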

And if you have a vector of file names you want to pass:

vapply(file_names, pdf_pages, numeric(1))
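Since the question is about a whole directory, here is a minimal sketch of how file_names might be built and the counts tabulated. The helper names list_pdfs and count_pages are made up for illustration, and the directory path is a placeholder; any function that returns one number per file can be passed as counter.

```r
# Collect the full paths of all PDF files in a directory.
# The pattern is case-insensitive so that "report.PDF" is matched too.
list_pdfs <- function(pdf_dir) {
  list.files(pdf_dir, pattern = "\\.pdf$", ignore.case = TRUE,
             full.names = TRUE)
}

# Apply a page-counting function to each file and return a data frame
# with one row per file.
count_pages <- function(files, counter) {
  data.frame(file  = basename(files),
             pages = vapply(files, counter, numeric(1), USE.NAMES = FALSE),
             stringsAsFactors = FALSE)
}

# Usage, assuming pdf_pages() is defined as above:
# file_names <- list_pdfs("C:/[file_path]")
# count_pages(file_names, pdf_pages)
```

Returning a data frame rather than a bare vector keeps each count next to its file name, which makes a hundred results easier to sort or export.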

Credit to @hrbrmstr for pointing out the poppler tools (I'd never heard of them until today).
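The path to pdfinfo.exe above is hard-coded to where I extracted the tools. If the poppler bin folder might instead be on your PATH, something like the following could locate the executable first. This is only a sketch: find_pdfinfo is a hypothetical helper, and the fallback path is simply the location used in this answer.

```r
# Look for pdfinfo on the PATH first, then fall back to a known location.
find_pdfinfo <- function(fallback = "C:/poppler/bin/pdfinfo.exe") {
  on_path <- unname(Sys.which("pdfinfo"))  # "" when nothing is found
  if (nzchar(on_path)) return(on_path)
  if (file.exists(fallback)) return(fallback)
  stop("pdfinfo was not found on the PATH or at ", fallback)
}

# Usage: pass the result to system2() in place of the hard-coded path
# pdfinfo_exe <- find_pdfinfo()
```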

With R version 3.3.2, the pdftools package works:

library(pdftools)
pdfInfo <- pdf_info("<path to PDF file>")
pdfInfo$pages
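To cover every file in a directory, the same call can be wrapped and applied to each path. The sketch below assumes pdftools is installed; safe_pages is a hypothetical wrapper, the directory path is a placeholder, and the info_fun parameter exists only so that a different reader could be substituted.

```r
# Page count for one file. Returns NA instead of stopping when a file
# cannot be read (for example, a corrupt or password-protected PDF).
safe_pages <- function(path, info_fun = pdftools::pdf_info) {
  tryCatch(as.numeric(info_fun(path)$pages),
           error = function(e) NA_real_)
}

# Placeholder directory; replace it with the folder that holds the PDFs
pdf_dir <- "C:/[file_path]"
files <- list.files(pdf_dir, pattern = "\\.pdf$", ignore.case = TRUE,
                    full.names = TRUE)

# Named numeric vector: one page count per file path
page_counts <- vapply(files, safe_pages, numeric(1))
```

The tryCatch is a design choice for batch runs: with about a hundred files, one unreadable PDF should probably show up as NA rather than abort the whole loop.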
