How To Create a Python file-like Object from an Iterator
I am testing the throughput of writing to S3 from a Python Glue shell job by using the upload_fileobj function from the boto3 client. The input to this function is
Fileobj (a file-like object) -- A file-like object to upload. At a minimum, it must implement the read method, and must return bytes.
To have the test isolate just the throughput, as opposed to memory or CPU capabilities, I think the best way to use upload_fileobj would be to pass an iterator that produces N bytes of the value 0.
In Python, how can a "file-like object" be created from an iterator?
I'm looking for something of the form
from itertools import repeat

number_of_bytes = 1024 * 1024
zero_iterator = repeat(b'\x00', number_of_bytes)  # b'\x00' is the byte with value 0; b'0' would be ASCII '0' (48)
file_like_object = something(zero_iterator)  # fill in 'something'
which would then be passed to boto3 for writing:
session.client('s3').upload_fileobj(file_like_object, Bucket='my_bucket', Key='my_key')  # Key is required; 'my_key' is a placeholder
Thank you in advance for your consideration and response.
This is a simplified version of the answer at https://stackoverflow.com/a/70547492/1319998, since we only need to deal with bytes, and so should be suitable for boto3's upload_fileobj:
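def to_file_like_obj(iterable):
    chunk = b''   # current chunk taken from the iterable
    offset = 0    # number of bytes of chunk already returned
    it = iter(iterable)

    def up_to_iter(size):
        # Yield pieces totalling at most size bytes, pulling new chunks as needed
        nonlocal chunk, offset

        while size:
            if offset == len(chunk):
                # Current chunk is used up, so fetch the next one
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(size, len(chunk) - offset)
            offset = offset + to_yield
            size -= to_yield
            yield chunk[offset - to_yield:offset]

    class FileLikeObj:
        def read(self, size=-1):
            # A size of None or a negative size means read until the iterable is exhausted
            return b''.join(up_to_iter(float('inf') if size is None or size < 0 else size))

    return FileLikeObj()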
If you have an iterable that yields bytes, say my_iterable, this can be used with boto3 as follows:
import boto3

target_obj = boto3.Session().resource('s3').Bucket('my-target-bucket').Object('my/target/key')
target_obj.upload_fileobj(to_file_like_obj(my_iterable))
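
To tie this back to the throughput test in the question, here is a self-contained sketch that can be run locally without S3. It is not part of the answer above: `zero_chunks`, `IterStream` and `make_zero_file` are hypothetical names, and `IterStream` is an alternative to `to_file_like_obj` built on the standard library's `io.RawIOBase` and `io.BufferedReader`. The 64 KiB chunk size is an assumption; yielding one byte at a time, as `repeat` would, should also work but is likely to make the Python loop the bottleneck rather than the network.

```python
import io


def zero_chunks(total_bytes, chunk_size=64 * 1024):
    """Yield total_bytes zero bytes in chunks of at most chunk_size."""
    chunk = bytes(chunk_size)  # bytes(n) is n bytes, each with value 0
    remaining = total_bytes
    while remaining > 0:
        piece = chunk if remaining >= chunk_size else chunk[:remaining]
        remaining -= len(piece)
        yield piece


class IterStream(io.RawIOBase):
    """Read-only raw stream over an iterable of bytes objects (hypothetical helper)."""

    def __init__(self, iterable):
        super().__init__()
        self._it = iter(iterable)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buffer):
        # Refill from the iterator when nothing is left over, skipping empty chunks
        while not self._leftover:
            try:
                self._leftover = next(self._it)
            except StopIteration:
                return 0  # 0 bytes read signals end of stream
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def make_zero_file(total_bytes):
    # BufferedReader supplies read(size) on top of the raw stream's readinto
    return io.BufferedReader(IterStream(zero_chunks(total_bytes)))
```

The object returned by `make_zero_file(number_of_bytes)` has a `read` method that returns bytes, so it should be usable wherever `to_file_like_obj(my_iterable)` is used above, although that has not been tested against S3 here.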