Python: capture all writes to a file in memory

Question

Is there a way to "capture" all attempted writes to a particular file, /my/special/file, and redirect them to a BytesIO or StringIO object, or otherwise get that output without actually writing to disk?

The use case: there is a 'handler' function whose contract is that it writes its output to /my/special/file. I have no control over this handler function: I didn't write it, I don't know its contents, I can't change it, and the contract cannot change. I'd like to be able to do something like this:

# 'output' has whatever 'handler' has written to `/my/special/file`
output = handler.run(data)

Even if this is an odd request, I'd be happy with a 'hackier' answer.

EDIT: my code (and handler) will be invoked many times on many chunks of data, so performance (both latency and throughput) is important.

Thanks.

Answer 1

If the code runs inside your own Python program, you can monkey-patch the built-in open function before that code gets called. Here's a deliberately crude example, but it shows that this is possible. Code that thinks it is writing to a file writes into an in-memory buffer instead. The calling code then prints what the foreign code wrote to the file:

import io

# The function you don't have access to that writes to a file
def foo():
    f = open("/tmp/foo", "w")
    f.write("blahblahblah\n")
    f.close()

# The buffer to contain the captured text
capture_buffer = ""

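# My silly file-like object that only handles write(text) and close()
class MyFileClass:
    def write(self, text):  # named 'text' so the built-in str is not shadowed
        global capture_buffer
        capture_buffer += text
    def close(self):
        pass

# Patch open to return a MyFileClass instance.
# Note: this rebinds open only in this module's namespace; to affect code
# defined in another module, assign to builtins.open instead.
def my_open2(*args, **kwargs):
    return MyFileClass()
open = my_open2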


# Call the target function
foo()

# Print what the function wrote to "the file"
print(capture_buffer)

Result:

blahblahblah

Sorry for not spending more time on this; I just wanted to show that it's possible. As others say, a mocking module might be the way to go, so you don't have to roll your own. I don't know for certain whether such modules give access to what was written, but I assume they must. Such a module will simply do a better job of what I've shown here.
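For what it's worth, the standard library's unittest.mock does record what was written. The following is only a sketch: handler_run is a hypothetical stand-in for the handler you can't modify, and call.args needs Python 3.8 or later. Note that this patches every open call while the patch is active, and a mock records each call it receives, so it is likely slower than a purpose-built buffer when performance matters.

```python
from unittest import mock


# Hypothetical stand-in for the handler you cannot modify
def handler_run(data):
    with open("/my/special/file", "w") as f:
        f.write(data)
        f.write("\n")


fake_open = mock.mock_open()

# Replace builtins.open only for the duration of the with-block
with mock.patch("builtins.open", fake_open):
    handler_run("hello")

# The mock file handle remembers every write() call it received
handle = fake_open.return_value
written = "".join(call.args[0] for call in handle.write.call_args_list)

# Confirm which path and mode the handler asked for
fake_open.assert_called_once_with("/my/special/file", "w")
print(repr(written))
```

Nothing touches the disk here; the handler's open call returns the mock handle, and the written text is reassembled from the recorded write() arguments.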

If your program does other file I/O with open, or with whichever method the mystery code uses to open the file, check the incoming path and return your special object only when it is the one path you're interested in. Otherwise, call the original open, which you can stash away under another name.
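The path check described above could be sketched roughly as follows. This is one possible shape, not a definitive implementation: capture_writes, _make_buffer and handler_run are names invented for illustration, and the sketch assumes the handler opens the file through the builtins.open name. Code that bypasses that name (for example os.open, or io.open called directly) would not be intercepted.

```python
import builtins
import io
from contextlib import contextmanager

SPECIAL_PATH = "/my/special/file"  # the path from the question


def _make_buffer(binary):
    """Build an in-memory file whose contents survive close()."""
    base = io.BytesIO if binary else io.StringIO

    class _Capture(base):
        captured = None  # filled in when the handler closes the "file"

        def close(self):
            if not self.closed:
                self.captured = self.getvalue()
            super().close()

    return _Capture()


@contextmanager
def capture_writes(path=SPECIAL_PATH):
    """Temporarily redirect open(path, <write mode>) to memory."""
    original_open = builtins.open  # stash the real open under another name
    buffers = []

    def patched_open(file, mode="r", *args, **kwargs):
        is_write = any(flag in mode for flag in "wax")
        if str(file) == path and is_write:
            buf = _make_buffer(binary="b" in mode)
            buffers.append(buf)
            return buf
        # Any other path (or a read) goes to the real open untouched
        return original_open(file, mode, *args, **kwargs)

    builtins.open = patched_open
    try:
        yield buffers
    finally:
        builtins.open = original_open  # always restore the real open


# Hypothetical stand-in for the handler you cannot modify
def handler_run(data):
    with open(SPECIAL_PATH, "w") as f:
        f.write(data.upper())


with capture_writes() as buffers:
    handler_run("chunk of data")

output = buffers[0].captured
print(output)
```

The buffer subclass saves its contents in close() because a plain StringIO or BytesIO discards its data once closed, and the handler will normally close the file before you get a chance to read it. Restoring builtins.open in the finally clause keeps the patch from leaking into the rest of the program.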

Answer 1: accepted, 2019-04-30 01:41:23
