Python GET Requests causing 100% CPU Usage Spike
I'm using the Python requests library (version 2.4.1) to perform a simple GET request; the code is below, nothing fancy. On most websites there are no issues, but on some, one in particular being www.pricegrabber.com, I see 100% CPU usage and the code never moves past the GET request. No timeout occurs, nothing; just a huge CPU usage spike that never stops.
import requests
url = 'http://www.pricegrabber.com'
r = requests.get(url, timeout=(1, 1))
print 'SUCCESS'
print r
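Worth noting about the call above: the two-element `timeout` is `(connect timeout, read timeout)`, and each value bounds a single socket operation, not the total duration of the request including redirects. A chain of fast 301 responses therefore never trips either timeout. A minimal sketch of this behavior (Python 3 syntax; `get_with_timeouts` is an illustrative wrapper, not part of requests):

```python
import requests

def get_with_timeouts(url, connect_timeout=1.0, read_timeout=1.0):
    """Issue a GET with a (connect, read) timeout tuple.

    Each timeout bounds one socket operation: connecting, or waiting
    for bytes of a single response.  A server that answers each
    redirect hop quickly will never exceed either limit, so a
    redirect loop spins without ever raising Timeout.
    """
    return requests.get(url, timeout=(connect_timeout, read_timeout))

if __name__ == '__main__':
    try:
        r = get_with_timeouts('http://www.pricegrabber.com')
        print(r.status_code)
    except requests.exceptions.Timeout:
        # Only fires if connecting or reading a single response stalls.
        print('timed out')
```

The tuple form of `timeout` was introduced in requests 2.4.0, so it is available in the 2.4.1 version the question uses.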
Using Python 2.7, the latest stable version of the requests library, and enabling logging as shown in this answer indicates that the HTTP request is stuck in a redirect loop:
INFO:requests.packages.urllib3.connectionpool:Starting new HTTP connection (1): www.pricegrabber.com
DEBUG:requests.packages.urllib3.connectionpool:"GET / HTTP/1.1" 301 20
DEBUG:requests.packages.urllib3.connectionpool:"GET /index.php/ut=43bb2597a77557f5 HTTP/1.1" 301 20
DEBUG:requests.packages.urllib3.connectionpool:"GET /?ut=43bb2597a77557f5 HTTP/1.1" 301 20
DEBUG:requests.packages.urllib3.connectionpool:"GET /?ut=43bb2597a77557f5 HTTP/1.1" 301 20
DEBUG:requests.packages.urllib3.connectionpool:"GET /?ut=43bb2597a77557f5 HTTP/1.1" 301 20
...
This continues for a while until:
requests.exceptions.TooManyRedirects: Exceeded 30 redirects.
And the code I used to discover this:
#!/usr/bin/env python
import logging
import requests
logging.basicConfig(level=logging.DEBUG)
url = 'http://www.pricegrabber.com'
r = requests.get(url, timeout=(1, 1))
print 'SUCCESS'
print r
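Given what the log shows, a workaround that follows is to bound or sidestep the redirect chain rather than rely on socket timeouts. A hedged sketch (Python 3 syntax; `fetch_or_none` and `first_hop` are illustrative names, not part of requests): `Session.max_redirects` caps the chain (the library default is 30, which matches the error above), and `allow_redirects=False` lets you inspect the first 301 yourself.

```python
import requests

def fetch_or_none(url, max_redirects=5, timeout=(3.05, 10)):
    """Fetch url, but give up after a few redirects instead of 30."""
    session = requests.Session()
    session.max_redirects = max_redirects  # library default is 30
    try:
        return session.get(url, timeout=timeout)
    except requests.exceptions.TooManyRedirects:
        return None

def first_hop(url, timeout=(3.05, 10)):
    """Return (status, Location) of the first response, without
    following redirects, to see where the loop starts."""
    r = requests.get(url, allow_redirects=False, timeout=timeout)
    return r.status_code, r.headers.get('Location')
```

Sites that loop like this sometimes do so only for requests without a browser-like `User-Agent` header, so passing `headers={'User-Agent': ...}` is another thing worth trying; whether that helps for this particular site is not confirmed by the log above.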