How to get the word count of all PDF files in a folder on Mac OS or Windows 10

Question

I know a few ways to get the word count for a single PDF file, but I have a folder containing 500+ PDF files, so I would like to know if there is a faster way to get the word count for all of them without opening every single file and copying and pasting the text.

I'm using macOS Catalina 10.15.5. A solution for Windows 10 is also fine for me.

Answer 1

I just ran the following command on my Windows machine:

Prompt>dir *.txt /S

The output was enormous, and it ended with:

     Total Files Listed:
            3620 File(s)     93.074.638 bytes
               0 Dir(s)  410.585.006.080 bytes free

Edit after first comment
PDF is a format designed to be read by humans, not parsed by computers, so I don't believe it is even possible to parse it and run calculations on it using just a few simple commands.

Answer 2

You can use pdfgrep, which you can install with homebrew using:

brew install pdfgrep

Then your command to count the words in all the files will be:

pdfgrep -c -P  "\b.*\b"  *.pdf

Sample Output

Arduino Wireless Communication With the HC-12.pdf:512
sample.pdf:0
simple.pdf:4
text.pdf:22

The -P means to use PCRE, or "Perl Compatible Regular Expressions", wherein \b signifies a word boundary, i.e. the start or end of a word.
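
One caveat with the pattern above: because .* is greedy, each line of text likely produces a single match, so the numbers may be closer to a count of non-empty lines than of words. As an alternative, here is a minimal sketch that swaps in a different technique: it extracts the text with pdftotext and counts whitespace-separated words with wc -w. It assumes pdftotext (part of Poppler, which is not mentioned in the original answer) is installed and on the PATH; the helper name count_words is made up for illustration.

```shell
#!/bin/sh
# Sketch: per-file and total word counts for every PDF in the current folder.
# Assumption: pdftotext (from Poppler) is installed; it is not part of the
# original answer.

# Hypothetical helper: count whitespace-separated words read from stdin.
count_words() {
    # wc -w may pad its output with blanks, so strip all whitespace
    wc -w | tr -d '[:space:]'
}

total=0
for f in *.pdf; do
    # Skip the literal pattern "*.pdf" when the folder contains no PDFs
    [ -e "$f" ] || continue
    # "-" sends the extracted text to standard output instead of a .txt file
    n=$(pdftotext "$f" - 2>/dev/null | count_words)
    printf '%s:%s\n' "$f" "$n"
    total=$((total + n))
done
printf 'total:%s\n' "$total"
```

Run it from inside the folder that holds the PDFs. Scanned PDFs that contain only images have no text layer, so they would show a count of 0 unless they are run through OCR first.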

Answer 1: 0 2020-11-27 09:46:44
Answer 2: 0 2020-11-27 11:13:16
