Controlling the concurrency of HTTP requests using Python's asyncio.Semaphore
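I'm trying to find a way to limit the number of concurrent HTTP requests made to a server using Python's asyncio and httpx modules. I came across this StackOverflow answer.
It proposes asyncio.Semaphore to stop multiple consumers from making too many requests. While that answer works perfectly, it uses explicit loop construction rather than asyncio.run. When I replace the explicit loop construction with asyncio.run, the behavior of the code changes. Instead of performing all 9 requests, it now executes only three requests and then stops.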
import asyncio
from random import randint

async def download(code):
    wait_time = randint(1, 3)
    print('downloading {} will take {} second(s)'.format(code, wait_time))
    await asyncio.sleep(wait_time)  # I/O, context will switch to main function
    print('downloaded {}'.format(code))

sem = asyncio.Semaphore(3)

async def safe_download(i):
    async with sem:  # semaphore limits num of simultaneous downloads
        return await download(i)

async def main():
    tasks = [
        asyncio.ensure_future(safe_download(i))  # creating task starts coroutine
        for i
        in range(9)
    ]
    await asyncio.gather(*tasks, return_exceptions=True)  # await moment all downloads done

if __name__ == '__main__':
    asyncio.run(main())
This prints:
downloading 0 will take 3 second(s)
downloading 1 will take 1 second(s)
downloading 2 will take 3 second(s)
downloaded 1
downloaded 0
downloaded 2
I had to change await asyncio.gather(*tasks) to await asyncio.gather(*tasks, return_exceptions=True) so that the code doesn't raise a RuntimeError. Without that change, and with asyncio debug mode turned on, it fails with this error:
downloading 0 will take 2 second(s)
downloading 1 will take 3 second(s)
downloading 2 will take 1 second(s)
Traceback (most recent call last):
  File "/home/rednafi/workspace/personal/demo/demo.py", line 66, in <module>
    asyncio.run(main())
  File "/usr/lib/python3.9/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/usr/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
    return future.result()
  File "/home/rednafi/workspace/personal/demo/demo.py", line 62, in main
    await asyncio.gather(*tasks) # await moment all downloads done
  File "/home/rednafi/workspace/personal/demo/demo.py", line 52, in safe_download
    async with sem: # semaphore limits num of simultaneous downloads
  File "/usr/lib/python3.9/asyncio/locks.py", line 14, in __aenter__
    await self.acquire()
  File "/usr/lib/python3.9/asyncio/locks.py", line 413, in acquire
    await fut
RuntimeError: Task <Task pending name='Task-5' coro=<safe_download() running at /home/rednafi/workspace/personal/demo/demo.py:52> cb=[gather.<locals>._done_callback() at /usr/lib/python3.9/asyncio/tasks.py:764] created at /home/rednafi/workspace/personal/demo/demo.py:58> got Future <Future pending created at /usr/lib/python3.9/asyncio/base_events.py:424> attached to a different loop
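However, the only other change was replacing the explicit loop with asyncio.run.
Why did the behavior of the code change, and how can I get the old, expected behavior back?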
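The problem is that the Semaphore created at top level caches the event loop that was active when it was created (the event loop that asyncio creates automatically and that get_event_loop() returns at startup). asyncio.run(), on the other hand, creates a fresh event loop on every run. As a result, you end up awaiting a semaphore that belongs to a different event loop, which fails. This matches the Python 3.9 traceback in the question; from Python 3.10 onward, asyncio synchronization primitives bind to a loop when they first need one rather than at construction. As always, hiding an exception without understanding its cause only leads to further problems.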
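To make the point about fresh event loops concrete, here is a minimal sketch. The helper name get_loop is made up for illustration and is not part of the question's code. It shows that two calls to asyncio.run() run on two distinct loop objects, and that each loop is closed once its run finishes, so anything tied to an earlier loop cannot be awaited in a later one.

```python
import asyncio


async def get_loop():
    # Hypothetical helper: return the event loop this coroutine is running on.
    return asyncio.get_running_loop()


# Each asyncio.run() call creates a new event loop and closes it on exit.
loop_a = asyncio.run(get_loop())
loop_b = asyncio.run(get_loop())

print(loop_a is loop_b)    # False: two distinct loop objects
print(loop_a.is_closed())  # True: the first loop was closed when its run ended
```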
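To fix the issue properly, you should create the semaphore inside the coroutine run by asyncio.run(). For example, the simplest fix could look like this: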
# ...
sem = None

async def main():
    global sem
    sem = asyncio.Semaphore(3)
    # ...
A more elegant approach is to remove sem from the top level entirely and pass it explicitly to safe_download:
async def safe_download(i, limit):
    async with limit:
        return await download(i)

async def main():
    # limit parallel downloads to 3 at most
    limit = asyncio.Semaphore(3)
    # you don't need to explicitly call create_task() if you call
    # `gather()` because `gather()` will do it for you
    await asyncio.gather(*[safe_download(i, limit) for i in range(9)])
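For reference, the same approach can be put together as a self-contained script. This is only a sketch under a few assumptions that are not in the original code: download() takes an extra stats dictionary (a hypothetical addition) to record how many downloads overlap, the random 1 to 3 second wait is replaced by a short fixed sleep, and download() returns its argument so the results can be inspected.

```python
import asyncio


async def download(code, stats):
    # Assumption: `stats` is a hypothetical counter used only to observe concurrency.
    stats["active"] += 1
    stats["peak"] = max(stats["peak"], stats["active"])
    await asyncio.sleep(0.01)  # stands in for the real I/O
    stats["active"] -= 1
    return code


async def safe_download(code, limit, stats):
    async with limit:  # at most 3 coroutines are inside this block at a time
        return await download(code, stats)


async def main():
    # Created inside the coroutine run by asyncio.run(), so the semaphore
    # is used on the same loop that runs the downloads.
    limit = asyncio.Semaphore(3)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(safe_download(i, limit, stats) for i in range(9))
    )
    return results, stats["peak"]


results, peak = asyncio.run(main())
print(results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
print(peak)     # 3
```

Because gather() is called without return_exceptions=True here, any error in a download would propagate out of main() instead of being silently collected.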