How To Create a Python file-like Object from an Iterator

I am testing the throughput of writing to S3 from a Python Glue shell job, using the upload_fileobj function of the boto3 client. The input to this function is

Fileobj (a file-like object) -- A file-like object to upload. At a minimum, it must implement the read method, and must return bytes.

To have the test isolate just the throughput, as opposed to memory or CPU capabilities, I think the best way to use upload_fileobj would be to pass an iterator that produces N bytes of the value 0.

In Python, how can a "file-like object" be created from an iterator?

I'm looking for something of the form

from itertools import repeat

number_of_bytes = 1024 * 1024

zero_iterator = repeat(b'0', number_of_bytes)

file_like_object = something(zero_iterator) # fill in 'something'

Which would then be passed to boto3 for writing

session.client('s3').upload_fileobj(file_like_object, Bucket='my_bucket', Key='my_key')  # Key is required by upload_fileobj; 'my_key' is a placeholder
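
One detail about the iterator above: b'0' is the ASCII digit "0" (byte value 0x30), not a zero-valued byte, and repeat(b'0', number_of_bytes) yields one byte per item, so per-item Python overhead may dominate a throughput test. A minimal sketch of a chunked source of zero-valued bytes follows; the name zero_chunks and the 64 KiB default chunk size are illustrative assumptions, not part of the original question.

```python
from itertools import repeat


def zero_chunks(total_bytes, chunk_size=64 * 1024):
    """Yield bytes objects of zero-valued bytes totalling exactly total_bytes.

    Hypothetical helper; chunk_size is assumed to be a positive integer.
    """
    full_chunks, remainder = divmod(total_bytes, chunk_size)
    # bytes(n) builds n zero-valued bytes; repeat() hands out the same
    # immutable object each time, so memory use stays near chunk_size.
    yield from repeat(bytes(chunk_size), full_chunks)
    if remainder:
        yield bytes(remainder)


number_of_bytes = 1024 * 1024
total = sum(len(chunk) for chunk in zero_chunks(number_of_bytes))
print(total)  # 1048576
```

Any iterable of bytes objects like this one can stand in for zero_iterator.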

Thank you in advance for your consideration and response.

This is a simplified version of the answer at https://stackoverflow.com/a/70547492/1319998, since we only need to deal with bytes, and so should be suitable for boto3's upload_fileobj

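def to_file_like_obj(iterable):
    # chunk is the bytes object currently being consumed,
    # offset is how far into it we have already read
    chunk = b''
    offset = 0
    it = iter(iterable)

    def up_to_iter(size):
        # Yield slices of the underlying chunks totalling at most size bytes
        nonlocal chunk, offset

        while size:
            if offset == len(chunk):
                # Current chunk is exhausted, so fetch the next one
                try:
                    chunk = next(it)
                except StopIteration:
                    break
                else:
                    offset = 0
            to_yield = min(size, len(chunk) - offset)
            offset = offset + to_yield
            size -= to_yield
            yield chunk[offset - to_yield:offset]

    class FileLikeObj:
        def read(self, size=-1):
            # None or a negative size means "read until the iterable is exhausted"
            return b''.join(up_to_iter(float('inf') if size is None or size < 0 else size))

    return FileLikeObj()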

If you have an iterable that yields bytes, my_iterable say, this can be used with boto3 as follows:

import boto3

target_obj = boto3.Session().resource('s3').Bucket('my-target-bucket').Object('my/target/key')
target_obj.upload_fileobj(to_file_like_obj(my_iterable))
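
As a possible alternative to a hand-written class, the standard library's io module can supply the read logic: subclass io.RawIOBase, implement readinto, and wrap the result in io.BufferedReader. The sketch below is not taken from the linked answer; the names IterStream and iterable_to_stream are hypothetical, and it assumes the iterable yields bytes objects.

```python
import io
from itertools import repeat


class IterStream(io.RawIOBase):
    """Hypothetical helper: a read-only raw stream over an iterable of bytes."""

    def __init__(self, iterable):
        super().__init__()
        self._it = iter(iterable)
        self._leftover = b''

    def readable(self):
        return True

    def readinto(self, buffer):
        # Skip empty chunks so that returning 0 always means end of stream.
        while not self._leftover:
            try:
                self._leftover = next(self._it)
            except StopIteration:
                return 0
        n = min(len(buffer), len(self._leftover))
        buffer[:n] = self._leftover[:n]
        self._leftover = self._leftover[n:]
        return n


def iterable_to_stream(iterable, buffer_size=io.DEFAULT_BUFFER_SIZE):
    # BufferedReader adds read(size) on top of the raw readinto().
    return io.BufferedReader(IterStream(iterable), buffer_size=buffer_size)


stream = iterable_to_stream(repeat(b'0', 1024))
first = stream.read(10)
rest = stream.read()
print(len(first), len(rest))  # 10 1014
```

The object returned by iterable_to_stream has a read method that returns bytes, which is the minimum that the upload_fileobj documentation quoted in the question asks for.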

