
Python Mechanize Prevent Connection:Close

I'm trying to use mechanize to get information from a web page. It basically succeeds in getting the first batch of information, but the page includes a "Next" button to get more, and I can't figure out how to fetch that additional information programmatically.

Using Live HTTP Headers, I can see the HTTP request that is generated when I click the Next button in a browser. It seems I can issue the same request using mechanize, but in that case, instead of getting the next page, I am redirected to the website's home page.

Obviously, mechanize is doing something different from what my browser does, but I can't figure out what. In comparing the headers, I did find one difference: the browser used

Connection: keep-alive

while mechanize used

Connection: close

I don't know if that's the culprit, but when I tried to add the header ('Connection', 'keep-alive'), it didn't change anything.

[UPDATE] When I click the button for "page 2" in Firefox, the generated HTTP request is (according to Live HTTP Headers):

GET /statistics/movies/ww_load/the-fast-and-the-furious-6-2012?authenticity_token=ItU38334Qxh%2FRUW%2BhKoWk2qsPLwYKDfiNRoSuifo4ns%3D&facebook_fans_page=2&tbl=facebook_fans&authenticity_token=ItU38334Qxh%2FRUW%2BhKoWk2qsPLwYKDfiNRoSuifo4ns%3D HTTP/1.1
Host: www.boxoffice.com
User-Agent: Mozilla/5.0 (Windows NT 6.1; rv:18.0) Gecko/20100101 Firefox/18.0
Accept: text/javascript, text/html, application/xml, text/xml, */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
X-Requested-With: XMLHttpRequest
X-Prototype-Version: 1.6.0.3
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
Cookie: __utma=179025207.1680379428.1359475480.1360001752.1360005948.13; __utmz=179025207.1359475480.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); __qca=P0-668235205-1359475480409; zip=13421; country_code=US; _boxoffice_session=2202c6a47fc5eb92cd0ba57ef6fbd2c8; __utmc=179025207; user_credentials=d3adbc6ecf16c038fcbff11779ad16f528db8ebd470befeba69c38b8a107c38e9003c7977e32c28bfe3955909ddbf4034b9cc396dac4615a719eb47f49cc9eac%3A%3A15212; __utmb=179025207.2.10.1360005948
Connection: keep-alive

When I try to request the same URL with mechanize, the request looks like this:

GET /statistics/movies/ww_load/the-fast-and-the-furious-6-2012?facebook_fans_page=2&tbl=facebook_fans&authenticity_token=ZYcZzBHD3JPlupj%2F%2FYf4dQ42Kx9ZBW1gDCBuJ0xX8X4%3D HTTP/1.1
Accept-Encoding: identity
Host: www.boxoffice.com
Accept: text/javascript, text/html, application/xml, text/xml, */*
Keep-Alive: 115
Connection: close
Cookie: _boxoffice_session=ced53a0ca10caa9757fd56cd89f9983e; country_code=US; zip=13421; user_credentials=d3adbc6ecf16c038fcbff11779ad16f528db8ebd470befeba69c38b8a107c38e9003c7977e32c28bfe3955909ddbf4034b9cc396dac4615a719eb47f49cc9eac%3A%3A15212
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1

-- Daryl
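One way to make the comparison between the two captures systematic is to parse both header blocks and list the header names the browser sent that mechanize did not. The following is only a sketch: the helper names parse_headers and missing_headers are made up for illustration, and the two sample blocks are abbreviated copies of the captures above (Cookie and User-Agent, which appear in both requests, are left out for brevity).

```python
def parse_headers(raw):
    """Parse 'Name: value' lines into a dict keyed by lower-cased header name."""
    headers = {}
    for line in raw.strip().splitlines():
        # Split on the first colon only, so URLs in values stay intact.
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def missing_headers(reference, candidate):
    """Return header names present in `reference` but absent from `candidate`."""
    return sorted(set(parse_headers(reference)) - set(parse_headers(candidate)))


# Abbreviated copy of the Firefox capture.
BROWSER = """
Host: www.boxoffice.com
Accept: text/javascript, text/html, application/xml, text/xml, */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
X-Requested-With: XMLHttpRequest
X-Prototype-Version: 1.6.0.3
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
Connection: keep-alive
"""

# Abbreviated copy of the mechanize capture.
MECHANIZE = """
Accept-Encoding: identity
Host: www.boxoffice.com
Accept: text/javascript, text/html, application/xml, text/xml, */*
Keep-Alive: 115
Connection: close
Referer: http://www.boxoffice.com/statistics/movies/the-fast-and-the-furious-6-2012
"""

print(missing_headers(BROWSER, MECHANIZE))
# → ['accept-language', 'x-prototype-version', 'x-requested-with']
```

The Connection header differs only in value, so it does not show up here; the headers that are absent altogether are the more likely suspects.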

The server was checking for X-Requested-With and/or X-Prototype-Version, so adding those two headers to the mechanize request fixed it.
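A minimal sketch of how those two headers might be added, assuming the values Firefox sent in the capture above and using the addheaders list that mechanize.Browser exposes. The names AJAX_HEADERS, with_ajax_headers and make_browser are hypothetical helpers, not part of mechanize.

```python
# Header values copied from the Firefox capture in the question.
AJAX_HEADERS = [
    ("X-Requested-With", "XMLHttpRequest"),
    ("X-Prototype-Version", "1.6.0.3"),
]


def with_ajax_headers(existing):
    """Return a new (name, value) list with the Ajax headers appended,
    dropping any earlier entries that use the same header names."""
    wanted = {name.lower() for name, _ in AJAX_HEADERS}
    kept = [(n, v) for n, v in existing if n.lower() not in wanted]
    return kept + AJAX_HEADERS


def make_browser():
    """Build a mechanize browser that sends the Ajax headers on every request."""
    import mechanize  # third-party; imported lazily so the helper above works without it

    br = mechanize.Browser()
    br.addheaders = with_ajax_headers(br.addheaders)
    return br
```

With this in place, br = make_browser() followed by br.open(url) should send both headers along with mechanize's defaults. Whether the server needs one or both is not known from the thread, so sending both is the safer choice.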

Maybe a little late with an answer, but I fixed this by editing one line in _urllib2_forked.py.

Line 1098 of that file reads:

headers["Connection"] = "Close"

Change this to:

if 'Connection' not in headers: headers["Connection"] = "Close"

and make sure you set the header in your script, and it will work.

Gr. Squandor
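The effect of that one-line patch can be illustrated in isolation. This is a stand-alone sketch of the logic only, not mechanize's actual code; the function name finalize_headers is made up for the example.

```python
def finalize_headers(headers):
    """Default to 'Connection: close' only when the caller has not set
    a Connection header, mirroring the patched line."""
    headers = dict(headers)  # work on a copy so the caller's dict is untouched
    if "Connection" not in headers:
        headers["Connection"] = "close"
    return headers


print(finalize_headers({"Connection": "keep-alive"}))
# → {'Connection': 'keep-alive'}
print(finalize_headers({"Accept": "*/*"}))
# → {'Accept': '*/*', 'Connection': 'close'}
```

Note that the membership test is case-sensitive, so a header set as "connection" would not be recognized and would be joined by a second, capitalized entry.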
