
Benchmark HTTP/1.1 vs HTTP/2 in Python

I ran some benchmarks comparing HTTP/1.1 and HTTP/2 in Python; the code simply repeats a Google search request many times. The result is interesting: the HTTP/2 version is considerably slower. (I tried both the pycurl and httpx libraries.) Can someone explain why this happens?

Update: this is the httpx version of the code (first run pip install httpx[http2]):

import time
import httpx
client = httpx.Client(http2=True)
start = time.time()
for _ in range(100):
    response = client.get("https://www.google.com/search?q=good")
    response.raise_for_status()
print(time.time() - start)

So it's important to understand what HTTP/2 aims to solve and what it doesn't.

HTTP/2 aims to be more efficient for multiple requests to the same site by adding multiplexing to the HTTP protocol. Basically, HTTP/1.1 blocks the whole connection while a request is in flight, but HTTP/2 doesn't, so other requests can be made during this time.

What this means is that a single request (like you are doing) is no better under HTTP/2 than it is under HTTP/1.1. In fact, it may even be slightly slower due to the extra set-up messages sent at the beginning of each HTTP/2 connection, which aren't needed under HTTP/1.1. Though to be honest I'm surprised that this difference was noticeable, so can you give more details of how much slower it was? It may also point to less efficient code in the HTTP/2 implementation you are using. Can you share the code?

Even in the browser context, if you look at a highly optimised site like Google's home page (probably the most visited page on the internet, and run by a company that knows a LOT about the web and how to optimise web pages), you may also not see a difference. The Google home page basically renders in a single request, as all critical resources are inlined to make it as fast as possible, no matter whether HTTP/1.1 or HTTP/2 is used.

However, for a typical page loaded in the browser, involving tens or even hundreds of requests, the advantage of HTTP/2 is often very noticeable.

And if you take an extreme site with lots of small requests (which is what HTTP/2 really excels at), the difference is really noticeable.

***** Edit, looking at the code you provided *****

Regarding your particular test case, I was able to reproduce this for Google, but not for other sites.

The difference seems too great to be due to HTTP/1.1 versus HTTP/2, so I suspected that HTTP/1.1 was reusing the connection but HTTP/2 was not. Moving the connection setup into the for loop gave the same slow results for both, similar to the previous HTTP/2 timings, seemingly confirming this.

import time
import httpx
start = time.time()
for _ in range(100): 
    client = httpx.Client(http2=True)
    response = client.get("https://www.google.com/search?q=good")
    response.raise_for_status()
print(time.time() - start)

Similarly, setting keepalives to 0 also slowed HTTP/1.1 down to match HTTP/2:

client = httpx.Client(http2=False, limits=httpx.Limits(max_keepalive_connections=0))

Keepalive is no longer a concept in HTTP/2: connections are kept alive by default until the client deems it no longer necessary to keep them around.

So this seems to be a problem with httpx's HTTP/2 handling (they do note it is experimental), rather than a problem with the protocol itself.

Finally, moving to this code style seemed to bring the HTTP/2 stats back into line with HTTP/1.1:

import time
import httpx

with httpx.Client(http2=True) as client:
    start = time.time()
    for _ in range(100):
        response = client.get("https://www.google.com/search?q=good")
        response.raise_for_status()
    print(time.time() - start)

But at this point Google was getting bored of me spamming their servers and started returning a 429 Client Error: Too Many Requests... error.

When I tried to reproduce the same issue on my own server, and then on stackoverflow.com, I couldn't: HTTP/1.1 and HTTP/2 were similar speeds. I'm not sure whether Google was doing some caching on their side for the same connection, which helped speed it up.

Anyway, the point is, this seems to be an implementation-specific issue and not something to do with the HTTP protocol itself.

https://github.com/dalf/pyhttp-benchmark might help in one way or another.

See:

TL;DR: with httpx, when using HTTP/2:

  • avoid large content (>64kb)
  • avoid sequential requests
  • prefer parallel requests
