
Nim: How can I improve concurrent async response time and quota to match CPython's asyncio?

For an upcoming project this year, I want to look into some languages I haven't really used yet but which have repeatedly caught my interest. Nim is one of them.

I wrote the following code to make asynchronous requests:

import asyncdispatch, httpclient, strformat, times, strutils

let urls = newHttpClient().getContent("https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt").splitLines()[0..99]

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc requestUrls(urls: seq[string]) {.async.} =
  let start = epochTime()
  echo "Starting requests..."

  var futures: seq[Future[string]]
  for url in urls:
    var client = newAsyncHttpClient()
    futures.add client.getHttpResp(&"http://www.{url}")
  for fut in futures:
    discard await fut

  echo &"Requested {len(urls)} websites in {epochTime() - start}."

waitFor requestUrls(urls)

The results over a few loop iterations:

Iterations: 10. Total errors: 94.
Average time to request 100 websites: 9.98s.

The finished application will only make requests to a single resource. So, for example, when requesting Google search queries (for simplicity, just the numbers 1 to 100), the result looks like this:

Iterations: 1. Total errors: 0.
Time to request 100 google searches: 3.75s.

Compared to Python, there is still a clear difference:

import asyncio, time, requests
from aiohttp import ClientSession

urls = requests.get(
  "https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt"
).text.split('\n')


async def getHttpResp(url: str, session: ClientSession):
  try:
    async with session.get(url) as resp:
      result = await resp.read()
      print(f"{url} - response length: {len(result)}")
  except Exception as e:
    print(f"Error: {url} - {e.__class__}")


async def requestUrls(urls: list[str]):
  start = time.time()
  print("Starting requests...")

  async with ClientSession() as session:
    await asyncio.gather(*[getHttpResp(f"http://www.{url}", session) for url in urls])

  print(f"Requested {len(urls)} websites in {time.time() - start}.")


# await requestUrls(urls) # jupyter
asyncio.run(requestUrls(urls))

Results:

Iterations: 10. Total errors: 10.
Average time to request 100 websites: 7.92s.

When only requesting Google search queries:

Iterations: 1. Total errors: 0.
Time to request 100 google searches: 1.38s.

Additionally: the difference in response times persists when comparing single requests to individual URLs and when only fetching the response status code. (I'm not a big Python fan, but when using it, the functionality provided by the C libraries is usually impressive.)


To improve the Nim code, I thought it might be worth trying to add channels and multiple clients (this from my still very limited perspective on day two of programming Nim, plus generally not much experience with concurrent requests). But I haven't really figured out how to make that work yet.
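The "channels and multiple clients" idea maps naturally onto a worker-pool pattern: a fixed number of clients pulling URLs from a shared queue. Below is a minimal sketch of that pattern using Python's asyncio.Queue, with asyncio.sleep standing in for real HTTP calls; the worker count of 5 and the example URLs are made-up illustrations, not values from the code above.

```python
import asyncio

async def worker(name: str, queue: asyncio.Queue, results: list) -> None:
    # Each worker plays the role of one dedicated client, pulling jobs
    # from the shared queue until it is cancelled.
    while True:
        url = await queue.get()
        await asyncio.sleep(0.05)  # stand-in for an actual HTTP request
        results.append(f"{name} fetched {url}")
        queue.task_done()

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(30):
        queue.put_nowait(f"http://example.com/{i}")  # hypothetical URLs

    results: list[str] = []
    workers = [
        asyncio.create_task(worker(f"client-{n}", queue, results))
        for n in range(5)
    ]
    await queue.join()  # returns once every queued URL has been processed
    for w in workers:   # the workers loop forever, so stop them explicitly
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results

results = asyncio.run(main())
print(f"Processed {len(results)} URLs with 5 workers.")
```

The same shape should carry over to Nim with threads and channels, or to a pool of AsyncHttpClient instances; this is a sketch of the pattern, not a drop-in solution.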

When performing this number of requests against the same endpoint in the Nim example (e.g., when doing the Google searches), repeating the run can also lead to Too Many Requests errors. This doesn't seem to be the case in Python.
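A common way to avoid tripping server-side rate limits is to cap the number of in-flight requests. A minimal sketch using asyncio.Semaphore, again with asyncio.sleep standing in for session.get; the limit of 10 is an arbitrary assumption for illustration:

```python
import asyncio

MAX_CONCURRENT = 10  # illustrative cap, tune to the target server's limits

async def fetch(url: str, sem: asyncio.Semaphore) -> str:
    # At most MAX_CONCURRENT coroutines may be past this point at once.
    async with sem:
        await asyncio.sleep(0.1)  # stand-in for a real HTTP request
        return f"{url}: ok"

async def main() -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    urls = [f"http://example.com/{i}" for i in range(100)]  # hypothetical
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

results = asyncio.run(main())
print(f"Completed {len(results)} rate-limited requests.")
```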

So it would be great if you could share your approaches to improving the response quota and request times!

If anyone would like a repo for cloning and tinkering, this one contains the example with the loop: https://github.com/tobealive/nim-async-requests-example

I tried to remember how Nim's async works, and unfortunately I don't see a real problem in your code. Compiling with -d:release does not seem to make much of a difference. One idea was timeouts, which may be handled differently in Python. From https://nim-lang.org/docs/httpclient.html#timeouts we learn that there are no timeouts for async, so a very slow page might keep a connection open for a long time. Maybe Python does time out? I could not test the Python module, as aiohttp is missing on my box. Below is my test, which is not really different from yours. I made main() non-async by using waitFor all(f). Sorry that I cannot really help you; maybe you should indeed try the chronos variant.

# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" & 
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc main =
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it 
    urls.setLen(100)
  # urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    echo x.len
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
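The timeout speculation above can at least be illustrated on the Python side: asyncio.wait_for imposes a per-request deadline and cancels the awaited coroutine when it expires. A sketch with simulated fast and slow responses (the URLs and delays are invented for the example):

```python
import asyncio

async def slow_fetch(url: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for a real HTTP request
    return f"{url}: ok"

async def fetch_with_timeout(url: str, delay: float, timeout: float) -> str:
    try:
        # Cancels slow_fetch if it takes longer than `timeout` seconds.
        return await asyncio.wait_for(slow_fetch(url, delay), timeout)
    except asyncio.TimeoutError:
        return f"{url}: timed out"

async def main() -> list[str]:
    return await asyncio.gather(
        fetch_with_timeout("http://fast.example", 0.05, 0.5),
        fetch_with_timeout("http://slow.example", 2.0, 0.5),
    )

results = asyncio.run(main())
print(results)
```

With such a wrapper, a handful of very slow hosts can no longer dominate the total run time, which is one plausible source of the gap between the two benchmarks.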

Testing with an extended version of the above program, I got the feeling that the total transfer rate is limited to only a few MB/s, so my idea about timeouts was plain wrong. I did some googling on the topic but could not find much useful information. As you wrote in your initial post, Nim's async from the standard library is not parallel, but it can (in theory) be used together with multithreading. I may do some tests with Chronos when I have more free time.

# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" & 
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  let start = epochTime()
  try:
    result = await client.getContent(url)
    stdout.write &"{url} - response length: {len(result)}"
  except Exception as e:
    stdout.write &"Error: {url} - {e.name}"
  echo fmt" --- Request took {epochTime() - start:.2f} seconds."

proc main =
  var transferred: int = 0
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it 
    urls.setLen(100)
  #urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    transferred += x.len
  echo fmt"Sum of transferred data: {transferred} bytes. ({transferred.float / (1024 * 1024).float / (epochTime() - start):.2f} MBytes/s)"
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
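The "async plus multithreading" idea mentioned above can be sketched in Python terms with a thread pool running blocking calls in parallel; the worker count of 10 and the fake blocking_fetch are illustrative assumptions, not part of the original benchmark:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(url: str) -> str:
    time.sleep(0.1)  # stand-in for a blocking HTTP request
    return f"{url}: ok"

urls = [f"http://example.com/{i}" for i in range(20)]  # hypothetical URLs

start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() preserves input order while the pool runs up to 10 calls at once.
    results = list(pool.map(blocking_fetch, urls))
elapsed = time.time() - start

print(f"{len(results)} requests in {elapsed:.2f}s")
```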

Reference:

https://xmonader.github.io/nimdays/day04_asynclinkschecker.html
