A replacement for Python's httplib?

I have a Python client which pushes a great deal of data through the standard library's httplib. Users are complaining that the application is slow. I suspect that this may be partly due to the HTTP client I am using.

Could I improve performance by replacing httplib with something else?

I've seen that Twisted offers an HTTP client. It seems to be very basic compared to their other protocol offerings.

PyCurl might be a valid alternative. Its use seems to be very un-Pythonic, but if its performance is really good then I can put up with a bit of un-Pythonic code.

So if you have experience with better HTTP client libraries for Python, please tell me about it. I'd like to know what you thought of the performance relative to httplib and what you thought of the quality of the implementation.

UPDATE 0: My use of httplib is actually very limited - the replacement needs to do the following:

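import httplib
import StringIO

# host, port, url, params and headers are supplied by the application
conn = httplib.HTTPConnection(host, port)
conn.request("POST", url, params, headers)
compressedstream = StringIO.StringIO(conn.getresponse().read())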

That's all: no proxies, redirection or any fancy stuff. It's plain old HTTP. I just need to be able to do it as fast as possible.

UPDATE 1: I'm stuck with Python 2.4 and I'm working on 32-bit Windows. Please do not tell me about better ways to use httplib - I want to know about some of the alternatives to httplib.

Often when I've had performance problems with httplib, the problem hasn't been with httplib itself, but with how I'm using it. Here are a few common pitfalls:

(1) Don't make a new TCP connection for every web request. If you are making lots of requests to the same server, avoid this pattern:

conn = httplib.HTTPConnection("www.somewhere.com")
conn.request("GET", '/foo')
conn = httplib.HTTPConnection("www.somewhere.com")
conn.request("GET", '/bar')
conn = httplib.HTTPConnection("www.somewhere.com")
conn.request("GET", '/baz')

Do this instead:

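conn = httplib.HTTPConnection("www.somewhere.com")
conn.request("GET", '/foo')
conn.getresponse().read()  # read each response fully before reusing the connection
conn.request("GET", '/bar')
conn.getresponse().read()
conn.request("GET", '/baz')
conn.getresponse().read()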
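As a rough illustration of the reuse pattern, here is a minimal Python 3 sketch (httplib is named http.client there). The EchoPathHandler test server, fetch_all and demo are hypothetical names made up for this example, not part of the original answer. The sketch records the client's local port for each request, so you can see that all three requests travel over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoPathHandler(BaseHTTPRequestHandler):
    """Hypothetical test handler: replies with the requested path."""
    protocol_version = "HTTP/1.1"  # lets the server keep the connection open

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch_all(host, port, paths):
    """Fetch every path over one connection; return (body, local_port) pairs."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    results = []
    try:
        for path in paths:
            conn.request("GET", path)
            local_port = conn.sock.getsockname()[1]
            # The response must be read fully before the next request is sent.
            body = conn.getresponse().read()
            results.append((body, local_port))
    finally:
        conn.close()
    return results


def demo():
    """Start a throwaway local server and fetch three paths from it."""
    server = HTTPServer(("127.0.0.1", 0), EchoPathHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    try:
        return fetch_all("127.0.0.1", server.server_address[1],
                         ["/foo", "/bar", "/baz"])
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() should return three (body, local_port) pairs that share one local port, which is what a single reused connection looks like from the client side.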

(2) Don't serialize your requests. You can use threads or asyncore or whatever you like, but if you are making multiple requests to different servers, you can improve performance by running them in parallel.
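One way to run requests in parallel is one thread per request. The following is only a sketch in Python 3 (where httplib is http.client). The names fetch_status and run_parallel are hypothetical helpers invented for this example.

```python
import http.client
import threading


def fetch_status(host, path="/", port=80, timeout=10):
    """Return the HTTP status code for one GET request to host."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        return response.status
    finally:
        conn.close()


def run_parallel(func, items):
    """Call func(item) for every item, using one thread per item.

    Returns a list of (item, result, error) tuples in input order.
    """
    results = [None] * len(items)

    def worker(index, item):
        try:
            results[index] = (item, func(item), None)
        except Exception as exc:  # report failures instead of losing them
            results[index] = (item, None, exc)

    threads = [threading.Thread(target=worker, args=(i, item))
               for i, item in enumerate(items)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

For example, run_parallel(fetch_status, ["www.example.com", "www.example.org"]) would issue both requests at once (the host names are placeholders). One thread per request is fine for a handful of servers; for hundreds of requests a bounded pool of worker threads is probably the safer choice.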

"Users are complaining that the application is slow. I suspect that this may be partly due to the HTTP client I am using."

"Could I improve performance by replacing httplib with something else?"

Do you suspect it, or are you sure that it's httplib? Profile before you do anything to improve the performance of your app.

I've found that my own intuition about where time is spent is often pretty bad (given that there isn't some code kernel executed millions of times). It's really disappointing to implement something to improve performance, then pull up the app and see that it made no difference.

If you're not profiling, you're shooting in the dark!
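A minimal profiling sketch using the standard library's cProfile and pstats modules is shown below. It is written for Python 3; cProfile was added in Python 2.5, so on Python 2.4 the pure-Python profile module would be the closest equivalent. profile_call is a hypothetical helper name, and upload_batch is only a stand-in for whatever routine the application actually spends its time in.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, report_text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the ten entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(10)
    return result, stream.getvalue()


def upload_batch(n):
    """Hypothetical stand-in for the application's upload routine."""
    return sum(len(str(i)) for i in range(n))
```

Printing the report from profile_call(upload_batch, 100000) lists where the time went. If the httplib calls are not near the top of that list, swapping the HTTP library is unlikely to help.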

PyCurl is great and performs extremely well.

httplib2 is another option: http://code.google.com/p/httplib2/

I have never benchmarked or profiled it in comparison to httplib, but I would also be interested in any findings there.


Dec. 2012 update: I no longer use httplib2. I now use Requests: HTTP for Humans for any HTTP work in Python.

You seem to assume it's the library. It's open source, so it would be worth checking the code to see if it is.

You mention that you're sending a lot of data over HTTP. The inefficiency might be because of the library, but HTTP isn't the most efficient protocol for sending large amounts of data. Then again, it could simply be how the library is used (are you sending one big string or list, or using a stream or generators?).

As others answered, httplib2 is a good alternative because it handles headers properly and can cache responses, but I doubt this would help POST performance.

An alternative that might actually give you a performance boost for POST, especially on Windows, is the new HTTP 1.1 client in Twisted.web.

httplib2 is a very good option. Joe Gregorio has fixed many bugs of httplib.

It works on my Windows machine: With Py 2.3 (without IPv6 support) this is only the IPv4 address, but with Py 2.4-2.6 the order is (on my Win XP host) the IPv6 address first, then the IPv4 address. Since the IPv6 address is checked first, this gives a timeout and causes the slow connect() call.

I only changed "localhost" to 127.0.0.1 and it started working more than 10 times faster (from 1087 ms to 87 ms). Solution from http://www.velocityreviews.com/forums/t668272-problem-with-slow-httplib-connections-on-windows-and-maybe-otherplatforms.html
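To check whether a host name resolves to an IPv6 address first on a given machine, you can look at the order returned by socket.getaddrinfo, which is, as far as I know, what httplib walks through when connecting. This is a small Python 3 sketch; resolved_addresses is a hypothetical helper name.

```python
import socket

# Human-readable labels for the two address families of interest.
FAMILY_NAMES = {socket.AF_INET: "IPv4", socket.AF_INET6: "IPv6"}


def resolved_addresses(host, port=80):
    """Return (family_name, address) pairs in the resolver's order."""
    infos = socket.getaddrinfo(host, port, 0, socket.SOCK_STREAM)
    return [(FAMILY_NAMES.get(family, str(family)), sockaddr[0])
            for family, _socktype, _proto, _canonname, sockaddr in infos]
```

If resolved_addresses("localhost") lists an IPv6 entry before the IPv4 one and the server only listens on IPv4, the client may first wait for the IPv6 attempt to fail. Passing the literal 127.0.0.1 skips that step.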


 