Handling RSS redirects with Python/urllib2

Question

Calling urllib2.urlopen on a link to an article fetched from an RSS feed leads to the following error:

urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently

According to the documentation, urllib2 supports redirects.

In Java, the problem was solved by simply calling:

HttpURLConnection.setFollowRedirects(true);

How can I solve it with Python?

UPDATE

The link I'm having problems with:

http://feeds.nytimes.com/click.phdo?i=8cd5af579b320b0bfd695ddcc344d96c

Answer 1

Turns out you need to enable cookies. The page sets a cookie and then redirects to itself. Because urllib2 does not handle cookies by default, the cookie is never sent back and the redirect repeats until urllib2 gives up, so you have to handle cookies yourself:

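import urllib2
from cookielib import CookieJar

# urllib2 ignores cookies by default, so attach a cookie jar to the opener.
cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
# The cookie set by the first response is sent back when following the redirect.
p = opener.open("http://feeds.nytimes.com/click.phdo?i=8cd5af579b320b0bfd695ddcc344d96c")

print p.read()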
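
For readers on Python 3, where urllib2 and cookielib became urllib.request and http.cookiejar, the self-redirect described above can be reproduced without touching the network. This is only a minimal sketch, not the feed server's actual behaviour: CookieGateHandler, start_demo_server, and fetch are hypothetical names, and the local server merely imitates a page that sets a cookie and redirects to itself.

```python
import threading
import urllib.error
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class CookieGateHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: redirects to itself until a cookie comes back."""

    def do_GET(self):
        if "seen=1" in self.headers.get("Cookie", ""):
            # The client returned the cookie, so serve the page.
            body = b"article body"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # Set a cookie and redirect to the very same URL.
            self.send_response(301)
            self.send_header("Set-Cookie", "seen=1; Path=/")
            self.send_header("Location", self.path)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo output quiet


def start_demo_server():
    # Port 0 lets the OS pick a free port.
    server = HTTPServer(("127.0.0.1", 0), CookieGateHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, "http://127.0.0.1:%d/article" % server.server_address[1]


def fetch(url, use_cookies):
    # ProxyHandler({}) ignores any environment proxy settings for this local demo.
    handlers = [urllib.request.ProxyHandler({})]
    if use_cookies:
        handlers.append(urllib.request.HTTPCookieProcessor(CookieJar()))
    opener = urllib.request.build_opener(*handlers)
    with opener.open(url, timeout=5) as response:
        return response.read()


if __name__ == "__main__":
    server, url = start_demo_server()
    try:
        try:
            fetch(url, use_cookies=False)
        except urllib.error.HTTPError as err:
            print("without cookies: HTTP Error", err.code)
        print("with cookies:", fetch(url, use_cookies=True))
    finally:
        server.shutdown()
```

Without the cookie processor the opener keeps being sent back to the same URL and gives up with an HTTP Error 301, as in the question; with it, the redirected request carries the cookie and the body comes back.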

Answer 2

Nothing wrong with @sleeplessnerd's solution, but this is very, very slightly more elegant:

import urllib2
url = "http://stackoverflow.com/questions/9926023/handling-rss-redirects-with-python-urllib2"
p = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(url)

print p.read()

In fact, if you look at the inline documentation (the docstring) for the CookieJar class, it more or less tells you to do things this way:

You may not need to know about this class: try urllib2.build_opener(HTTPCookieProcessor).open(url)
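
The short form works because build_opener accepts handler classes as well as instances and instantiates them itself, and HTTPCookieProcessor creates its own CookieJar when none is passed. A quick check of that, sketched with the Python 3 module names (urllib.request and http.cookiejar) rather than urllib2; the variable names are illustrative only:

```python
import urllib.request
from http.cookiejar import CookieJar

# Pass the class itself; build_opener calls it with no arguments.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor)

# Find the cookie processor that build_opener created for us.
processors = [
    handler for handler in opener.handlers
    if isinstance(handler, urllib.request.HTTPCookieProcessor)
]

# The processor owns a fresh CookieJar even though none was supplied.
print(len(processors), isinstance(processors[0].cookiejar, CookieJar))
```

The explicit CookieJar from the first answer is only needed when you want to inspect or reuse the cookies afterwards.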

Answer 1
26, accepted, 2012-03-29 14:31:21

Answer 2
9, 2013-10-07 12:17:40
