
Error 429 when invoking Reddit API from Google App Engine

I have been running a cron job on Google App Engine for over a month without any issues. Among other things, the job uses urllib2 to retrieve JSON responses from Reddit and a few other sites. About two weeks ago I started seeing errors when calling Reddit, but not when calling the other sites. The error I receive is HTTP error 429.

I have tried running the same code outside of Google App Engine and have no issues there. I also tried using urlfetch, but I receive the same error.

You can reproduce the error in App Engine's interactive shell with the following code.

import urllib2
data = urllib2.urlopen('http://www.reddit.com/r/Music/.json', timeout=60)

Edit: I'm not sure why it always fails for me and not for others. This is the error that I receive:

>>> import urllib2
>>> data = urllib2.urlopen('http://www.reddit.com/r/Music/.json', timeout=60)
Traceback (most recent call last):
  File "/base/data/home/apps/s~shell-27/1.356011914885973647/shell.py", line 267, in get
    exec compiled in statement_module.__dict__
  File "<string>", line 1, in <module>
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 400, in open
    response = meth(req, response)
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 513, in http_response
    'http', request, response, code, msg, hdrs)
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 438, in error
    return self._call_chain(*args)
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 372, in _call_chain
    result = func(*args)
  File "/base/python27_runtime/python27_dist/lib/python2.7/urllib2.py", line 521, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 429: Unknown

Similar code running outside of App Engine works with no problem:

import urllib2
print urllib2.urlopen('http://www.reddit.com/r/Music/.json').read()

At first I thought it was a timeout problem, since the job originally worked. But the failure is not a timeout error, it is this strange HTTPError code, so I'm not sure. Any ideas?

Reddit rate limits the API pretty severely for the default user agent sent from the Python shell. You need to set a unique user agent that includes your Reddit username, like this:

User-Agent: super happy flair bot by /u/spladug

More info about the Reddit API is available at https://github.com/reddit/reddit/wiki/API.
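As a minimal sketch of the suggestion above, the request from the question could carry a custom User-Agent header as shown below. The helper names build_reddit_request and fetch_json are hypothetical, the try/except import is only there so the snippet runs on both Python 2 (urllib2, as in the question) and Python 3, and the user agent string is the example from this answer, which you would replace with one naming your own bot and Reddit username.

```python
import json

try:
    import urllib2 as urlrequest  # Python 2, as used in the question
except ImportError:
    import urllib.request as urlrequest  # Python 3 equivalent

# Example string from the answer; substitute your own bot name and username.
USER_AGENT = 'super happy flair bot by /u/spladug'


def build_reddit_request(url, user_agent=USER_AGENT):
    """Return a Request that sends a descriptive User-Agent header."""
    return urlrequest.Request(url, headers={'User-Agent': user_agent})


def fetch_json(url, user_agent=USER_AGENT, timeout=60):
    """Fetch the URL with the custom User-Agent and decode the JSON body."""
    request = build_reddit_request(url, user_agent)
    response = urlrequest.urlopen(request, timeout=timeout)
    try:
        return json.loads(response.read().decode('utf-8'))
    finally:
        response.close()
```

Calling fetch_json('http://www.reddit.com/r/Music/.json') would then replace the bare urlopen call from the question. Whether this alone clears the 429 on App Engine is not guaranteed.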

It's possible that Reddit is counting calls by IP address, which means that other applications on GAE that share your IP might already be exhausting the quota.

This might improve if you use Reddit API keys (I don't know whether they issue them) or if they agree to rate limit API calls based on the app header.
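If the quota really is shared across an IP, one possible mitigation (not something the answers here confirm works on App Engine) is to wait and retry when a 429 comes back. The sketch below assumes the server may send a Retry-After header with a delay in seconds and otherwise falls back to exponential backoff. The name call_with_backoff and its parameters are hypothetical, and fetch is any zero-argument callable that performs the request.

```python
import time

try:
    from urllib2 import HTTPError  # Python 2, as used in the question
except ImportError:
    from urllib.error import HTTPError  # Python 3 equivalent


def call_with_backoff(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch(); on HTTP 429 wait and retry, re-raising any other error."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except HTTPError as err:
            # Give up on non-429 errors or once the attempts are used up.
            if err.code != 429 or attempt == max_attempts - 1:
                raise
            # Prefer the server's Retry-After hint (seconds) when present.
            retry_after = err.headers.get('Retry-After') if err.headers else None
            try:
                delay = float(retry_after)
            except (TypeError, ValueError):
                # No usable hint: back off exponentially (2s, 4s, 8s, ...).
                delay = base_delay * (2 ** attempt)
            sleep(delay)
```

For example, call_with_backoff(lambda: urllib2.urlopen('http://www.reddit.com/r/Music/.json', timeout=60)) would retry the call from the question up to four times.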
