
Nim: How can I improve concurrent async response time and quota to match CPython's asyncio?

For an upcoming project this year, I want to look into some languages I haven't really used yet but which have repeatedly caught my interest. Nim is one of them.

I wrote the following code to make asynchronous requests:

import asyncdispatch, httpclient, strformat, times, strutils

let urls = newHttpClient().getContent("https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt").splitLines()[0..99]

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc requestUrls(urls: seq[string]) {.async.} =
  let start = epochTime()
  echo "Starting requests..."

  var futures: seq[Future[string]]
  for url in urls:
    var client = newAsyncHttpClient()
    futures.add client.getHttpResp(&"http://www.{url}")
  for fut in futures:
    discard await fut

  echo &"Requested {len(urls)} websites in {epochTime() - start}."

waitFor requestUrls(urls)

The results over a few loop iterations:

Iterations: 10. Total errors: 94.
Average time to request 100 websites: 9.98s.

The finished application will only make requests to a single resource. So, for example, when requesting Google search queries (for simplicity, just the numbers 1 to 100), the result looks like this:

Iterations: 1. Total errors: 0.
Time to request 100 google searches: 3.75s.

Compared to Python, there is still a clear difference:

import asyncio, time, requests
from aiohttp import ClientSession

urls = requests.get(
  "https://gist.githubusercontent.com/tobealive/b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/100-popular-urls.txt"
).text.split('\n')


async def getHttpResp(url: str, session: ClientSession):
  try:
    async with session.get(url) as resp:
      result = await resp.read()
      print(f"{url} - response length: {len(result)}")
  except Exception as e:
    print(f"Error: {url} - {e.__class__}")


async def requestUrls(urls: list[str]):
  start = time.time()
  print("Starting requests...")

  async with ClientSession() as session:
    await asyncio.gather(*[getHttpResp(f"http://www.{url}", session) for url in urls])

  print(f"Requested {len(urls)} websites in {time.time() - start}.")


# await requestUrls(urls) # jupyter
asyncio.run(requestUrls(urls))

Results:

Iterations: 10. Total errors: 10.
Average time to request 100 websites: 7.92s.

When only requesting Google search queries:

Iterations: 1. Total errors: 0.
Time to request 100 google searches: 1.38s.

Additionally: the difference in response times persists when comparing single requests to individual URLs and when only fetching the response status code. (I'm not a big Python fan, but when using it, the functionality provided by the C libraries is usually impressive.)


To improve the Nim code, I thought it might be worth trying to add channels and multiple clients (this from my still very limited perspective on day two of programming Nim, plus generally not much experience with concurrent requests). But I haven't really figured out how to make that work yet.
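The "channels and multiple clients" idea maps naturally onto a worker-pool pattern: a fixed number of clients pulling URLs from a shared queue. Below is a minimal sketch of that pattern using Python's asyncio.Queue, with asyncio.sleep standing in for real HTTP calls; the worker count of 5 and the example URLs are made-up illustrations, not values from the code above.

```python
import asyncio

async def worker(name: str, queue: asyncio.Queue, results: list) -> None:
    # Each worker plays the role of one dedicated client, pulling jobs
    # from the shared queue until it is cancelled.
    while True:
        url = await queue.get()
        await asyncio.sleep(0.05)  # stand-in for an actual HTTP request
        results.append(f"{name} fetched {url}")
        queue.task_done()

async def main() -> list[str]:
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(30):
        queue.put_nowait(f"http://example.com/{i}")  # hypothetical URLs

    results: list[str] = []
    workers = [
        asyncio.create_task(worker(f"client-{n}", queue, results))
        for n in range(5)
    ]
    await queue.join()  # returns once every queued URL has been processed
    for w in workers:   # the workers loop forever, so stop them explicitly
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results

results = asyncio.run(main())
print(f"Processed {len(results)} URLs with 5 workers.")
```

The same shape should carry over to Nim with threads and channels, or to a pool of AsyncHttpClient instances; this is a sketch of the pattern, not a drop-in solution.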

When performing this number of requests against the same endpoint in the Nim example (e.g., when doing the Google searches), repeating the run can also lead to Too Many Requests errors. This doesn't seem to be the case in Python.
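A common way to avoid tripping server-side rate limits is to cap the number of in-flight requests. A minimal sketch using asyncio.Semaphore, again with asyncio.sleep standing in for session.get; the limit of 10 is an arbitrary assumption for illustration:

```python
import asyncio

MAX_CONCURRENT = 10  # illustrative cap, tune to the target server's limits

async def fetch(url: str, sem: asyncio.Semaphore) -> str:
    # At most MAX_CONCURRENT coroutines may be past this point at once.
    async with sem:
        await asyncio.sleep(0.1)  # stand-in for a real HTTP request
        return f"{url}: ok"

async def main() -> list[str]:
    sem = asyncio.Semaphore(MAX_CONCURRENT)
    urls = [f"http://example.com/{i}" for i in range(100)]  # hypothetical
    return await asyncio.gather(*(fetch(u, sem) for u in urls))

results = asyncio.run(main())
print(f"Completed {len(results)} rate-limited requests.")
```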

So it would be great if you could share your approaches to improving the response quota and request times!

If anyone would like a repo for cloning and tinkering, this one contains the example with the loop: https://github.com/tobealive/nim-async-requests-example

I tried to remember how Nim's async works, and unfortunately I don't see a real problem in your code. Compiling with -d:release does not seem to make much of a difference. One idea was timeouts, which may be handled differently in Python. From https://nim-lang.org/docs/httpclient.html#timeouts we learn that there are no timeouts for async, so a very slow page might keep a connection open for a long time. Maybe Python does time out? I could not test the Python module, as aiohttp is missing on my box. Below is my test, which is not really different from yours. I made main() non-async by using waitFor all(f). Sorry that I cannot really help you; maybe you should indeed try the chronos variant.

# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" & 
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  try:
    result = await client.getContent(url)
    echo &"{url} - response length: {len(result)}"
  except Exception as e:
    echo &"Error: {url} - {e.name}"

proc main =
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it 
    urls.setLen(100)
  # urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    echo x.len
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
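The timeout speculation above can at least be illustrated on the Python side: asyncio.wait_for imposes a per-request deadline and cancels the awaited coroutine when it expires. A sketch with simulated fast and slow responses (the URLs and delays are invented for the example):

```python
import asyncio

async def slow_fetch(url: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stand-in for a real HTTP request
    return f"{url}: ok"

async def fetch_with_timeout(url: str, delay: float, timeout: float) -> str:
    try:
        # Cancels slow_fetch if it takes longer than `timeout` seconds.
        return await asyncio.wait_for(slow_fetch(url, delay), timeout)
    except asyncio.TimeoutError:
        return f"{url}: timed out"

async def main() -> list[str]:
    return await asyncio.gather(
        fetch_with_timeout("http://fast.example", 0.05, 0.5),
        fetch_with_timeout("http://slow.example", 2.0, 0.5),
    )

results = asyncio.run(main())
print(results)
```

With such a wrapper, a handful of very slow hosts can no longer dominate the total run time, which is one plausible source of the gap between the two benchmarks.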

Testing with an extended version of the above program, I got the feeling that the total transfer rate is limited to only a few MB/s, so my idea about timeouts was plain wrong. I did some googling on the topic but could not find much useful information. As you wrote in your initial post, Nim's async from the standard library is not parallel, but it can (in theory) be used together with multithreading. I may do some tests with Chronos when I have more free time.

# nim r -d:ssl -d:release t.nim
import std/[asyncdispatch, httpclient, strutils, strformat, times]

const
  UrlSource = "https://gist.githubusercontent.com/tobealive/" & 
    "b2c6e348dac6b3f0ffa150639ad94211/raw/31524a7aac392402e354bced9307debd5315f0e8/" &
    "100-popular-urls.txt"

proc getHttpResp(client: AsyncHttpClient, url: string): Future[string] {.async.} =
  let start = epochTime()
  try:
    result = await client.getContent(url)
    stdout.write &"{url} - response length: {len(result)}"
  except Exception as e:
    stdout.write &"Error: {url} - {e.name}"
  echo fmt" --- Request took {epochTime() - start:.2f} seconds."

proc main =
  var transferred: int = 0
  let start = epochTime()
  echo "Starting requests..."
  var urls = newHttpClient().getContent(UrlSource).splitLines
  if urls.len > 100: # in case that there are more than 100, clamp it 
    urls.setLen(100)
  #urls.setLen(3) # for fast tests with only a few urls
  var f: seq[Future[string]]
  for url in urls:
    let client = newAsyncHttpClient()
    f.add(client.getHttpResp(&"http://www.{url}"))
  let res: seq[string] = waitFor all(f)
  for x in res:
    transferred += x.len
  echo fmt"Sum of transferred data: {transferred} bytes. ({transferred.float / (1024 * 1024).float / (epochTime() - start):.2f} MBytes/s)"
  echo fmt"Requested {len(urls)} websites in {epochTime() - start:.2f} seconds."

main()
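The "async plus multithreading" idea mentioned above can be sketched in Python terms with a thread pool running blocking calls in parallel; the worker count of 10 and the fake blocking_fetch are illustrative assumptions, not part of the original benchmark:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def blocking_fetch(url: str) -> str:
    time.sleep(0.1)  # stand-in for a blocking HTTP request
    return f"{url}: ok"

urls = [f"http://example.com/{i}" for i in range(20)]  # hypothetical URLs

start = time.time()
with ThreadPoolExecutor(max_workers=10) as pool:
    # map() preserves input order while the pool runs up to 10 calls at once.
    results = list(pool.map(blocking_fetch, urls))
elapsed = time.time() - start

print(f"{len(results)} requests in {elapsed:.2f}s")
```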

Reference:

https://xmonader.github.io/nimdays/day04_asynclinkschecker.html
