Nim: How can I improve concurrent async response time and quota to match CPython's asyncio?
For a project coming up this year, I want to look into some languages I haven't really used yet but that have repeatedly caught my interest. Nim is one of them.
I wrote the following code to make asynchronous requests:
import asyncdispatch, httpclient, strformat, times, strutils

let urls = newHttpClient().getContent("https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt").splitLines()[0..99]

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc requestUrls(urls: seq[string]) {.async.} =
  let start = epochTime()
  echo "Starting requests..."

  var futures: seq[Future[string]]
  for url in urls:
    var client = newAsyncHttpClient()
    futures.add client.getHttpResp(&"http://www.{url}")

  for i in 0..urls.len-1:
    discard await futures[i]

  echo &"Requested {len(urls)} websites in {epochTime() - start}."

waitFor requestUrls(urls)
Results over a few iterations of the loop:
Iterations: 10. Total errors: 94.
Average time to request 100 websites: 9.98s.
The finished application will only request a single resource. So, for example, when requesting Google search queries (for simplicity, just the numbers 1 to 100), the results look like this:
Iterations: 1. Total errors: 0.
Time to request 100 google searches: 3.75s.
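For reference, a rough guess at how such search-query URLs might be built; the exact query format ("https://www.google.com/search?q=<n>") is my assumption and is not shown in the original code:

import std/[sequtils, strformat]

# build URLs for the queries "1" through "100"
let searchUrls = toSeq(1..100).mapIt(&"https://www.google.com/search?q={it}")
echo searchUrls[0]   # https://www.google.com/search?q=1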
Compared to Python, there is still a noticeable difference:
import asyncio, time, requests
from aiohttp import ClientSession

urls = requests.get(
    "https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt"
).text.split('\n')

async def getHttpResp(url: str, session: ClientSession):
    try:
        async with session.get(url) as resp:
            result = await resp.read()
            print(f"{url} - response length: {len(result)}")
    except Exception as e:
        print(f"Error: {url} - {e.__class__}")

async def requestUrls(urls: list[str]):
    start = time.time()
    print("Starting requests...")

    async with ClientSession() as session:
        await asyncio.gather(*[getHttpResp(f"http://www.{url}", session) for url in urls])

    print(f"Requested {len(urls)} websites in {time.time() - start}.")

# await requestUrls(urls) # jupyter
asyncio.run(requestUrls(urls))
Results:
Iterations: 10. Total errors: 10.
Average time to request 100 websites: 7.92s.
When only requesting the Google search queries:
Iterations: 1. Total errors: 0.
Time to request 100 google searches: 1.38s.
Also: the difference in response time persists when comparing single responses to individual URLs and when only fetching the response status code. (I'm not a big Python fan, but when using it, the functionality provided by its C-backed libraries is often impressive.)
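The status-code-only comparison could look roughly like the following minimal sketch (assuming a HEAD request; the original code for that test isn't shown here):

import std/[asyncdispatch, httpclient]

proc statusOf(url: string): Future[string] {.async.} =
  let client = newAsyncHttpClient()
  try:
    let resp = await client.head(url)   # fetch headers only, no body
    result = resp.status
  finally:
    client.close()

echo waitFor statusOf("http://www.example.com")   # e.g. "200 OK"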
To improve the Nim code, I thought it might be worth trying to add channels and multiple clients (that's from my still very limited perspective on day two of programming in Nim, plus not much prior experience with concurrent requests in general). But I haven't really figured out how to make that work yet. The direction I had in mind is sketched below.
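A very rough, untested sketch of the "channels and multiple clients" idea: worker threads pull URLs from a channel, each using its own blocking HttpClient. The worker layout, the 10-second timeout, and the example URLs are just placeholders, not a working solution.

# build with: nim r --threads:on -d:ssl workers.nim
import std/[httpclient, strformat]

var
  jobs: Channel[string]    # URLs to fetch ("" signals shutdown)
  results: Channel[int]    # response lengths (-1 on error)

proc worker() {.thread.} =
  let client = newHttpClient(timeout = 10_000)
  while true:
    let url = jobs.recv()
    if url.len == 0: break          # poison pill: stop this worker
    try:
      results.send(client.getContent(url).len)
    except CatchableError:
      results.send(-1)
  client.close()

proc main =
  let urls = @["example.com", "nim-lang.org", "wikipedia.org", "github.com"]
  jobs.open()
  results.open()
  var workers: array[4, Thread[void]]
  for t in workers.mitems: createThread(t, worker)
  for url in urls: jobs.send(&"http://www.{url}")
  for _ in workers: jobs.send("")   # one shutdown signal per worker
  for _ in urls: echo "response length: ", results.recv()
  joinThreads(workers)
  jobs.close()
  results.close()

main()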
Doing this many requests against the same endpoint in the Nim example (e.g. when doing the Google searches) can also result in Too Many Requests errors if that number of Google searches is performed repeatedly. This doesn't seem to be the case in Python.
So it would be great if you could share your approaches to improving the response quota and the request times!
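One direction I considered for the Too Many Requests issue is simple throttling: capping how many requests are in flight at once and pausing briefly between batches. A rough, unbenchmarked sketch; the batch size and pause below are arbitrary assumptions:

import std/[asyncdispatch, httpclient, strformat]

proc fetch(url: string): Future[int] {.async.} =
  let client = newAsyncHttpClient()
  try:
    result = (await client.getContent(url)).len
  except CatchableError:
    result = -1
  finally:
    client.close()

proc requestThrottled(urls: seq[string], batchSize = 10, pauseMs = 500) {.async.} =
  var i = 0
  while i < urls.len:
    var batch: seq[Future[int]]
    for url in urls[i ..< min(i + batchSize, urls.len)]:
      batch.add fetch(url)
    for length in await all(batch):
      echo &"response length: {length}"
    i += batchSize
    if i < urls.len:
      await sleepAsync(pauseMs)     # brief pause between batches

# usage: waitFor requestThrottled(myUrls, batchSize = 10)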
If anyone wants a repo to clone and tinker with, this one contains the example with the loops: https://github.com/tobealive/nim-async-requests-example
I tried to remember how Nim's async works, and unfortunately I don't see a real problem in your code. Compiling with -d:release doesn't seem to make much of a difference. One idea was timeouts, which Python might handle differently: from https://nim-lang.org/docs/httpclient.html#timeouts we learn that there is no timeout support for async, so a very slow page could keep a connection open for a long time. Maybe Python does time out? I couldn't test the Python module, aiohttp is missing on my box. Below is my test, which is not much different from yours. I made main() non-async by using waitFor all(f). Sorry that I can't really help you; maybe you should indeed try the chronos variant.
# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" &
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc main =
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it
    urls.setLen(100)
  # urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    echo x.len
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
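Regarding the missing async timeout mentioned above: if you wanted to bound individual requests, asyncdispatch's withTimeout could be wrapped around the future. Just a sketch, I haven't measured whether it changes anything, and the 5-second limit is arbitrary:

import std/[asyncdispatch, httpclient, strformat]

proc getWithTimeout(client: AsyncHttpClient, url: string,
                    timeoutMs: int): Future[string] {.async.} =
  let fut = client.getContent(url)
  # withTimeout completes with `true` if `fut` finished within the limit
  if await fut.withTimeout(timeoutMs):
    result = fut.read()
  else:
    echo &"Timed out after {timeoutMs} ms: {url}"
    result = ""

when isMainModule:
  let client = newAsyncHttpClient()
  echo waitFor(getWithTimeout(client, "http://www.example.com", 5000)).len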
Testing with an extended version of the above program, I get the feeling that the total transfer rate is limited to only a few MB/s, so my idea about timeouts was plainly wrong. I did some googling on the topic and couldn't find much useful information. As you wrote in your initial post, Nim's async from the standard library is not parallel, but it should (in theory) be possible to use it together with multithreading. I may do some testing with Chronos when I have more free time.
# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" &
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  let start = epochTime()
  try:
    result = await client.getContent(url)
    stdout.write &"{url} - response length: {len(result)}"
  except Exception as e:
    stdout.write &"Error: {url} - {e.name}"
  echo fmt" --- Request took {epochTime() - start:.2f} seconds."

proc main =
  var transferred: int = 0
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it
    urls.setLen(100)
  # urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    transferred += x.len
  echo fmt"Sum of transferred data: {transferred} bytes. ({transferred.float / (1024 * 1024).float / (epochTime() - start):.2f} MBytes/s)"
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
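And as a rough illustration of the "async plus multithreading" idea mentioned above: my own untested sketch, in which each thread drives its own asyncdispatch event loop over a chunk of the URLs. The chunking, the example URLs, and the cast(gcsafe) workaround are all my assumptions, not something I have benchmarked.

# build with: nim r --threads:on -d:ssl chunked.nim
import std/[asyncdispatch, httpclient, strformat]

proc fetch(url: string): Future[int] {.async.} =
  let client = newAsyncHttpClient()
  try:
    result = (await client.getContent(url)).len
  except CatchableError:
    result = -1
  finally:
    client.close()

proc fetchChunk(urls: seq[string]) {.thread.} =
  {.cast(gcsafe).}:             # each thread only touches its own data
    var futs: seq[Future[int]]
    for url in urls:
      futs.add fetch(&"http://www.{url}")
    for length in waitFor all(futs):
      echo &"response length: {length}"

proc main =
  let urls = @["example.com", "nim-lang.org", "wikipedia.org", "github.com"]
  var threads: array[2, Thread[seq[string]]]
  createThread(threads[0], fetchChunk, urls[0 .. 1])
  createThread(threads[1], fetchChunk, urls[2 .. 3])
  joinThreads(threads)

main()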
Reference:
https://xmonader.github.io/nimdays/day04_asynclinkschecker.html