Run a command-line program from Python and pipe arguments from memory

Question

I was wondering whether there is a way to run a command-line executable from Python and pass it argument values from memory, without having to write the in-memory data to a temporary file on disk. From what I have seen, subprocess.Popen(args) seems to be the preferred way to run programs from inside Python scripts.

For example, I have a PDF file in memory. I want to convert it to text using the command-line tool pdftotext, which is present in most Linux distros. But I would prefer not to write the in-memory PDF to a temporary file on disk.

pdfInMemory = myPdfReader.read()
convertedText = subprocess.<method>(['pdftotext', ??]) <- what is the value of ??

Which method should I call, and how do I pipe in-memory data into its input and pipe its output back into another variable in memory?

I am guessing there are other PDF modules that can do the conversion in memory, and information about those modules would be helpful. But for future reference, I am also interested in how to pipe input and output to a command-line program from inside Python.

Any help would be much appreciated.

Answer 1

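With Popen.communicate:

import subprocess
# stdin=PIPE is required so that communicate() can feed pdf_data to the child.
proc = subprocess.Popen(["pdftotext", "-", "-"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
out, err = proc.communicate(pdf_data)  # err is None because stderr is not piped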
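As a slightly fuller sketch of the same idea, the helper below wraps Popen.communicate and also captures stderr so that a failing child can be reported. The name pipe_through and the UPPERCASE_FILTER stand-in command are hypothetical and only there so the example runs without pdftotext installed; for the PDF case you would pass ["pdftotext", "-", "-"] instead, assuming your pdftotext build accepts "-" for stdin and stdout.

```python
import subprocess
import sys


def pipe_through(cmd, data):
    """Feed `data` (bytes) to the child's stdin and return its stdout as bytes."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,   # lets communicate() write `data` to the child
        stdout=subprocess.PIPE,  # lets communicate() collect the child's output
        stderr=subprocess.PIPE,  # captured so errors can be reported
    )
    out, err = proc.communicate(data)
    if proc.returncode != 0:
        raise RuntimeError("%s failed: %s" % (cmd[0], err.decode(errors="replace")))
    return out


# Hypothetical stand-in for an external filter: a child Python process that
# upper-cases whatever bytes it receives on stdin.
UPPERCASE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())",
]

if __name__ == "__main__":
    print(pipe_through(UPPERCASE_FILTER, b"in-memory bytes"))
```

Both the input and the output live entirely in memory here, so this is best suited to data that fits comfortably in RAM.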

Answer 2

os.tmpfile is useful if the program needs seekable input. It does use a file, but it is nearly as simple as the pipe approach and needs no cleanup.

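import os
import subprocess

tf = os.tmpfile()   # anonymous temp file, deleted automatically once closed
tf.write(...)       # write the in-memory data
tf.seek(0)          # rewind so the child reads from the start
subprocess.Popen(..., stdin=tf)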

This may not work on the POSIX-impaired OS 'Windows'.
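Note that os.tmpfile exists only in Python 2. A rough Python 3 equivalent of the approach above, assuming tempfile.TemporaryFile is an acceptable substitute, might look like the sketch below. The function name run_with_tempfile_stdin and the REVERSE_FILTER stand-in command are hypothetical and used only so the example is runnable without an external tool.

```python
import subprocess
import sys
import tempfile


def run_with_tempfile_stdin(cmd, data):
    """Write `data` (bytes) to an anonymous temp file and use it as the child's stdin."""
    with tempfile.TemporaryFile() as tf:
        tf.write(data)
        tf.seek(0)  # flush and rewind so the child reads from the beginning
        # The child inherits the file as its stdin; stdout is returned as bytes.
        return subprocess.check_output(cmd, stdin=tf)


# Hypothetical stand-in for an external tool: reverses the bytes it reads.
REVERSE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read()[::-1])",
]

if __name__ == "__main__":
    print(run_with_tempfile_stdin(REVERSE_FILTER, b"abc"))
```

The temporary file is removed when the with block exits, so no explicit cleanup is needed. Unlike a pipe, the child receives a real file, which matters for tools that need to seek in their input.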

Answer 3

Popen.communicate from subprocess takes an input parameter that is used to send data to stdin; you can use that to pass in your data. You also get the output of your program back from communicate, so you don't have to write it to a file.

The documentation for communicate explicitly warns that everything is buffered in memory, which seems to be exactly what you want to achieve.
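On Python 3.7 or later, subprocess.run offers the same input and output handling in a single call, using communicate internally. The sketch below assumes that version; run_filter and SWAPCASE_FILTER are hypothetical names, and the stand-in child process is only there so the example runs without pdftotext.

```python
import subprocess
import sys


def run_filter(cmd, data, timeout=30):
    """Send `data` (bytes) to cmd's stdin and return its stdout as bytes."""
    result = subprocess.run(
        cmd,
        input=data,           # sent to the child's stdin
        capture_output=True,  # collect stdout and stderr in memory
        check=True,           # raise CalledProcessError on a non-zero exit
        timeout=timeout,      # avoid waiting forever on a stuck child
    )
    return result.stdout


# Hypothetical stand-in for an external filter: swaps the case of its input.
SWAPCASE_FILTER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().swapcase())",
]

if __name__ == "__main__":
    print(run_filter(SWAPCASE_FILTER, b"Hello"))
```

As with communicate, the whole input and output are held in memory, so very large data may be better streamed or passed through a file.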
