Twisted giving twisted.web.client.PartialDownloadError: 200 OK

I have the following code snippet, slightly modified from the original docs. The code works properly when url is set to http://google.com. But it fails when this is changed to http://www.google.com. The error is Failure: twisted.web.client.PartialDownloadError: 200 OK. The traceback is below the code snippet.

Initially I thought the code might be failing because it does not handle SSL properly. But, looking at the headers, this doesn't appear to be the issue. This is my first time working with Twisted; I don't know what else could be causing the problem.

Code

from sys import argv
from pprint import pformat
from twisted.internet.task import react
from twisted.web.client import Agent, BrowserLikeRedirectAgent, readBody
from twisted.web.http_headers import Headers
from twisted.internet import reactor
from twisted.internet.ssl import ClientContextFactory

responses = []

class WebClientContextFactory(ClientContextFactory):
    def getContext(self, hostname, port):
        return ClientContextFactory.getContext(self)

def cbBody(r):
    print 'Response body:'
    print r
    responses.append(r)

def cbRequest(response):
    print 'Response version:', response.version
    print 'Response code:', response.code
    print 'Response phrase:', response.phrase
    print 'Response headers:'
    print pformat(list(response.headers.getAllRawHeaders()))
    d = readBody(response)
    d.addCallback(cbBody)
    return d

def main(reactor):
    contextFactory = WebClientContextFactory()
    agent = BrowserLikeRedirectAgent(Agent(reactor, contextFactory))
    url=b"http://google.com/"
    agent = Agent(reactor, contextFactory)
    d = agent.request(
        'GET', url,
        Headers({'User-Agent': ['Twisted Web Client Example']}),
        None)
    d.addCallback(cbRequest)
    return d

react(main)

Traceback

In [1]: %tb
---------------------------------------------------------------------------
SystemExit                                Traceback (most recent call last)
/usr/local/lib/python2.7/site-packages/IPython/utils/py3compat.pyc in execfile(fname, glob, loc, compiler)
    218             else:
    219                 scripttext = builtin_mod.open(fname).read().rstrip() + '\n'
--> 220                 exec(compiler(scripttext, filename, 'exec'), glob, loc)
    221
    222

/project/demo.py in <module>()
     42     return d
     43
---> 44 react(main)

/usr/local/lib/python2.7/site-packages/twisted/internet/task.pyc in react(main, argv, _reactor)
    902     finished.addBoth(cbFinish)
    903     _reactor.run()
--> 904     sys.exit(codes[0])
    905
    906

SystemExit: 1

It shouldn't be too surprising that requests for different URLs produce different responses. The URLs identify different resources, so you should expect to get different responses when requesting them.

The reason you get a PartialDownloadError when you request http://www.google.com/ is that Google is sending a response with neither a Content-Length header nor Transfer-Encoding: chunked. This means the only signal the client has that the response is complete is the TCP connection closing. Unfortunately, TCP connections can close for other reasons, so it is ambiguous whether the response was ever fully received.

Google seems to frame the response this way because of the particular details of how Agent issues the request. Google responds with Transfer-Encoding: chunked to requests made by other user agents.

One option to address this is to decide you don't care if responses are truncated without your knowledge. In this case, add an errback to the readBody Deferred that handles PartialDownloadError. The exception has a response attribute giving you the data that was read up until the TCP connection closed. Grab that data and return it, and you've converted the maybe-failed case into a who-cares-pretend-it-succeeded case.

Another option is to try fiddling with the details of the request until you convince Google to give you a Transfer-Encoding: chunked (or at least a Content-Length). Of course, this solution breaks as soon as you meet another server that doesn't feel like giving you either one.
