
Python aiohttp module: ambiguous .content attribute

Here is a little code snippet:

import aiohttp
import aiofiles

async def fetch(url):
    # starting a session
    async with aiohttp.ClientSession() as session:
        # starting a get request
        async with session.get(url) as response:
            # getting response content
            content = await response.content
            return content
 
async def save_file(file_name, content):
    async with aiofiles.open(f'./binary/{file_name}', 'wb') as f:
        while True:
            chunk = content.read(1024)
            if not chunk:
                break
            f.write(chunk)

I am trying to download some binary files using the aiohttp library and then pass them to a coroutine using the aiofiles library to write the files to disk. I have read the documentation but still couldn't figure out: can I pass content = await response.content, or is it closed when the async with .. handle is closed? Because on a secondary blog, I found:

According to aiohttp's documentation, because the response object was created in a context manager, it technically calls release() implicitly.

Which confuses me: should I embed the logic of the second function inside the response handle, or is my logic correct?

The async context manager will close the resources related to the request, so if you return from the function, you have to make sure you've read everything of interest. So you have two options:

  1. read the entire response into memory, e.g. with content = await response.read() (see the sketch after this list), or
  2. if the file doesn't fit into memory (or if you want to speed things up by reading and writing in parallel), use a queue or an async iterator to parallelize reading and writing.
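
For completeness, here is a minimal, untested sketch of option #1, reusing the fetch/save_file names from the question; it assumes the whole file fits comfortably in memory:

import aiohttp
import aiofiles

async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            # read() buffers the entire body; the returned bytes remain
            # valid after the response is closed
            return await response.read()

async def save_file(file_name, content):
    async with aiofiles.open(f'./binary/{file_name}', 'wb') as f:
        await f.write(content)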

Here is an untested implementation of #2:

async def fetch(url):
    # return an async generator over contents of URL
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            # getting response content in chunks no larger than 4K
            async for chunk in response.content.iter_chunked(4096):
                yield chunk

async def save_file(file_name, content_iter):
    async with aiofiles.open(f'./binary/{file_name}', 'wb') as f:
        async for chunk in content_iter:
            await f.write(chunk)

async def main():
    await save_file(file_name, fetch(url))

Thanks to user4815162342's code, I could find a solution by parallelizing the fetch and write coroutines. I would have marked his code as the accepted solution, but since I had to add some code to make it work, here it is:

import asyncio
import aiohttp
import aiofiles

# fetch binary from server
async def fetch(url):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as response:
            async for chunk in response.content.iter_chunked(4096):
                yield chunk

# write binary function
async def save_file(file_name, chunk_iter):
    # create_dir_tree and list_binary_sub_dirs are helpers defined elsewhere in my project
    list(map(create_dir_tree, list_binary_sub_dirs))
    async with aiofiles.open(f'./binary/bin_ts/{file_name}', 'wb') as f:
        async for chunk in chunk_iter:
            await f.write(chunk)
    

async def main(urls):
    tasks = []
    for url in urls:
        print('running on sublist')
        file_name = url.rpartition('/')[-1]
        request_ts = fetch(url)
        tasks.append(save_file(file_name, request_ts))
    await asyncio.gather(*tasks)

asyncio.run(main(some_list_of_urls))
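
One design note: the code above opens a new ClientSession for every URL, while the aiohttp documentation recommends reusing a single session across requests. A variant that shares one session could look like this untested sketch (the extra session parameter on fetch is an assumption, not part of the original code; save_file is unchanged from above):

# hypothetical variant of fetch that reuses a shared session
async def fetch(session, url):
    async with session.get(url) as response:
        async for chunk in response.content.iter_chunked(4096):
            yield chunk

async def main(urls):
    # one ClientSession for all downloads, per the aiohttp docs' advice
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in urls:
            file_name = url.rpartition('/')[-1]
            tasks.append(save_file(file_name, fetch(session, url)))
        await asyncio.gather(*tasks)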
