
Handling RSS redirects with Python/urllib2

Calling urllib2.urlopen on a link to an article fetched from an RSS feed leads to the following error:

urllib2.HTTPError: HTTP Error 301: The HTTP server returned a redirect error that would lead to an infinite loop.
The last 30x error message was:
Moved Permanently

According to the documentation, urllib2 supports redirects.

In Java, the problem was solved by simply calling:

HttpURLConnection.setFollowRedirects(true);

How can I solve it with Python?

UPDATE

The link I'm having problems with:

http://feeds.nytimes.com/click.phdo?i=8cd5af579b320b0bfd695ddcc344d96c

It turns out you need to enable cookies. The page first sets a cookie and then redirects to itself, so a client that never sends the cookie back is redirected again and again. Because urllib2 does not handle cookies by default, you have to do it yourself.

import urllib2
from cookielib import CookieJar

# Keep cookies between requests so the redirect back to the same URL can succeed.
cj = CookieJar()
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
p = opener.open("http://feeds.nytimes.com/click.phdo?i=8cd5af579b320b0bfd695ddcc344d96c")

print p.read()
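To see why the cookie matters, the behaviour can be reproduced locally. The following is only a sketch in Python 3 (where urllib2 became urllib.request), not the feed server's actual logic: CookieGateHandler, the cookie name seen, the /click path, and run_demo are all hypothetical names made up for illustration. The stand-in server redirects to the same URL until the client sends the cookie back.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.error import HTTPError
from urllib.request import HTTPCookieProcessor, ProxyHandler, build_opener


class CookieGateHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in server: redirects to itself until the cookie comes back."""

    def do_GET(self):
        if "seen=1" in self.headers.get("Cookie", ""):
            # The client returned the cookie, so serve the content.
            body = b"article body"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # No cookie yet: set one and redirect to the same path.
            self.send_response(301)
            self.send_header("Set-Cookie", "seen=1")
            self.send_header("Location", self.path)
            self.send_header("Content-Length", "0")
            self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def run_demo():
    """Fetch the same URL without and with cookie handling; return both outcomes."""
    server = HTTPServer(("127.0.0.1", 0), CookieGateHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = "http://127.0.0.1:%d/click" % server.server_port
    no_proxy = ProxyHandler({})  # ignore any proxy settings from the environment
    try:
        try:
            build_opener(no_proxy).open(url, timeout=5)
            plain_result = "ok"
        except HTTPError as err:
            # Without cookies the redirect handler gives up on the loop.
            plain_result = err.code
        with build_opener(no_proxy, HTTPCookieProcessor()).open(url, timeout=5) as resp:
            cookie_result = resp.read()
    finally:
        server.shutdown()
        server.server_close()
    return plain_result, cookie_result
```

Under these assumptions, run_demo() should return (301, b"article body"): the plain opener hits the redirect loop, while the opener with HTTPCookieProcessor gets through after one redirect.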

Nothing wrong with @sleeplessnerd's solution, but this is very, very slightly more elegant:

import urllib2
url = "http://stackoverflow.com/questions/9926023/handling-rss-redirects-with-python-urllib2"
p = urllib2.build_opener(urllib2.HTTPCookieProcessor).open(url)

print p.read()

In fact, if you look at the inline documentation (the docstring) for the CookieJar class, it more or less tells you to do things this way:

You may not need to know about this class: try urllib2.build_opener(HTTPCookieProcessor).open(url)
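Both answers target Python 2. On Python 3, urllib2 and cookielib were renamed to urllib.request and http.cookiejar, so a rough equivalent might look like the sketch below. The helper names make_cookie_opener and fetch are hypothetical, chosen here only for illustration.

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def make_cookie_opener(jar=None):
    """Return (opener, jar); the opener stores and resends cookies via the jar."""
    # Compare against None explicitly: an empty CookieJar is falsy.
    if jar is None:
        jar = CookieJar()
    return build_opener(HTTPCookieProcessor(jar)), jar


def fetch(url, timeout=10.0):
    """Fetch url, following redirects with cookie support, and return the raw bytes."""
    opener, _ = make_cookie_opener()
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

Calling fetch(url) with the feed link from the question would then mirror the one-liner above; keeping the jar returned by make_cookie_opener is only needed if you want to inspect or reuse the cookies afterwards.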
