Python script to run a command on all files in a folder

Question

To convert a PDF to text, I am using the following command:

pdf2txt.py -o text.txt example.pdf # It will convert example.pdf to text.txt

But I have more than 1000 PDF files, which I need to convert to text files first before doing the analysis.

Is there a way to use this command to iterate over the PDF files and convert all of them?

Answer 1

I would suggest using a shell script:

for f in *.pdf; do pdf2txt.py -o "${f%.pdf}.txt" "$f"; done

Then read all the .txt files using Python for your analysis.
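For the analysis step, the converted files could be loaded along these lines. This is only a minimal sketch: the function name `read_text_files`, the folder argument, and the UTF-8 encoding are assumptions, since the thread does not say how the text files are encoded or what the analysis looks like.

```python
from pathlib import Path


def read_text_files(folder="."):
    """Return a dict mapping each .txt file name in `folder` to its contents."""
    texts = {}
    for txt_path in sorted(Path(folder).glob("*.txt")):
        # Assumption: files are UTF-8; undecodable bytes are replaced, not fatal.
        texts[txt_path.name] = txt_path.read_text(encoding="utf-8", errors="replace")
    return texts


if __name__ == "__main__":
    # Print each file name with the number of characters read from it.
    for name, text in read_text_files().items():
        print(name, len(text))
```

Holding 1000+ documents in one dict is usually fine for plain text, but if memory becomes a concern the loop could yield one file at a time instead.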

Using only Python to convert:

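from subprocess import call
import glob

# Convert every PDF in the current directory to a .txt file of the same name.
for pdf_file in glob.glob('*.pdf'):
    # The output file follows -o; the input PDF comes last.
    call(["pdf2txt.py", "-o", pdf_file[:-3] + "txt", pdf_file])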
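With more than 1000 files, it may help to skip PDFs that were already converted and to record which ones failed. The following is a sketch under stated assumptions: the helper names `output_path_for` and `convert_folder`, the injectable `runner` parameter, and the skip-if-exists rule are all added for illustration. The only part taken from the thread is the `pdf2txt.py -o OUTPUT INPUT` invocation shown in the question.

```python
import subprocess
from pathlib import Path


def output_path_for(pdf_path):
    """Map example.pdf to example.txt in the same folder."""
    return Path(pdf_path).with_suffix(".txt")


def convert_folder(folder=".", runner=subprocess.run):
    """Run pdf2txt.py on every PDF in `folder`; return the PDFs that failed."""
    failed = []
    for pdf_path in sorted(Path(folder).glob("*.pdf")):
        txt_path = output_path_for(pdf_path)
        if txt_path.exists():
            # Assumption: an existing .txt means this PDF was already converted.
            continue
        # Same argument order as the command in the question: -o OUTPUT INPUT
        result = runner(["pdf2txt.py", "-o", str(txt_path), str(pdf_path)])
        if result.returncode != 0:
            failed.append(pdf_path)
    return failed
```

Calling `convert_folder()` from the folder that holds the PDFs returns the list of files whose conversion exited with a non-zero status, so an interrupted run can simply be started again.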

Answer 2

The Python code failed on my Windows 10 machine (OSError: [WinError 193] %1 is not a valid Win32 application). The for loop should be:

for pdf_file in glob.glob('*.pdf'):
    call(['python.exe','pdf2txt.py','-o',pdf_file[:-3]+'txt',pdf_file])

Be careful with the order of the file arguments: the output file follows -o and the input PDF comes last. If the two are swapped, your PDF files will be overwritten by empty files.
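One way to avoid hard-coding `python.exe` is to use `sys.executable`, which holds the path of the interpreter currently running, and to look the script up on `PATH` with `shutil.which`. The sketch below only builds the argument list. The helper name `build_command` is hypothetical, and whether `pdf2txt.py` is actually found on `PATH` depends on how it was installed, which the thread does not say.

```python
import shutil
import sys
from pathlib import Path


def build_command(pdf_file, script="pdf2txt.py"):
    """Build an argument list that runs `script` with the current interpreter."""
    # shutil.which returns None when the script is not found on PATH; in that
    # case fall back to the bare name, which the interpreter then resolves
    # relative to the current directory.
    script_path = shutil.which(script) or script
    txt_file = str(Path(pdf_file).with_suffix(".txt"))
    # The output file goes right after -o; the input PDF comes last.
    return [sys.executable, script_path, "-o", txt_file, str(pdf_file)]


if __name__ == "__main__":
    print(build_command("example.pdf"))
```

The returned list can be passed directly to `call` in place of the hand-written one in the loop.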

Still, thanks to Gurupad Hegde for showing me how to convert the files. It helped a lot!

Answer 1: score 3, accepted, 2015-06-03 15:51:21

Answer 2: score 0, 2016-08-25 16:15:42
