
Pytrends: The request failed: Google returned a response with code 429

I'm using Pytrends to extract Google Trends data, like this:

from pytrends.request import TrendReq
pytrend = TrendReq()
pytrend.build_payload(kw_list=['bitcoin'], cat=0, timeframe=from_date+' '+today_date)

It returns an error:

ResponseError: The request failed: Google returned a response with code 429.

It worked yesterday, but for some reason it doesn't work now! The example from the GitHub source code failed too:

pytrends = TrendReq(hl='en-US', tz=360, proxies = {'https': 'https://34.203.233.13:80'})

How can I fix this? Thanks a lot!
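The first snippet in the question leaves from_date and today_date undefined. A minimal sketch of how they might be built, assuming the timeframe is expected as two 'YYYY-MM-DD' dates separated by a space (build_timeframe is a hypothetical helper, not part of pytrends):

```python
from datetime import date, timedelta


def build_timeframe(days_back: int, today: date) -> str:
    """Return 'START END' with both dates in YYYY-MM-DD format (assumed format)."""
    start = today - timedelta(days=days_back)
    return f"{start.isoformat()} {today.isoformat()}"


# Fixed date used here only so the example is reproducible;
# date.today() would be the usual choice.
timeframe = build_timeframe(30, date(2021, 3, 31))
print(timeframe)
```

The resulting string could then be passed as the timeframe argument of build_payload.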

This one took a while, but it turned out the library just needed an update. You can check out the two approaches I posted here, both of which resulted in status 429 responses:

https://github.com/GeneralMills/pytrends/issues/243

Ultimately, I was able to get it working again by running the following command from my bash prompt:

pip install --upgrade --user git+https://github.com/GeneralMills/pytrends

This installs the latest version.

Hope that works for you too.

EDIT:

If you can't upgrade from source, you may have some luck with:

pip install pytrends --upgrade

Also, if you are on Windows, make sure you're running git as an administrator.
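To confirm that the upgrade actually took effect, it may help to check which version Python sees. A minimal sketch using only the standard library (installed_version is a hypothetical helper name; run it in a freshly started interpreter):

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the version found in the current environment, or None.
print(installed_version("pytrends"))
```

If this prints an older version than expected, the upgrade probably went into a different Python environment than the one running your script.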

I had the same problem even after updating the module with pip install --upgrade --user git+https://github.com/GeneralMills/pytrends and restarting Python.

However, the issue was solved as follows.

Instead of:

pytrends = TrendReq(hl='en-US', tz=360, timeout=(10,25), proxies=['https://34.203.233.13:80',], retries=2, backoff_factor=0.1, requests_args={'verify':False})

I just ran:

pytrend = TrendReq()

Hope this helps!

TL;DR: I solved the problem with a custom patch.

Explanation

The problem comes from Google's bot-detection system. Like other such systems, it stops serving requests that arrive too frequently from suspicious clients. One of the signals it uses to recognize trustworthy clients is the presence of specific headers generated by the JavaScript code on its web pages. Unfortunately, the Python requests library offers no such camouflage against bot-detection systems, since it does not execute JavaScript at all. So the idea behind my patch is to reuse the headers my browser generates when it interacts with Google Trends. The browser generates those headers while I am logged in with my Google account; in other words, they are linked to my Google account, so to Google I look trustworthy.
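Since the block is triggered by requests that arrive too frequently, spacing out retries is a common complementary measure. The sketch below is a generic illustration, not part of pytrends and not the patch described here; call_with_backoff and its parameters are hypothetical names, and it may not help at all if the client is rejected because of its headers rather than its request rate:

```python
import time


def call_with_backoff(func, retries=3, backoff_factor=1.0, sleep=time.sleep):
    """Call func(); after each failure wait backoff_factor * 2**attempt seconds, then retry.

    The last exception is re-raised once `retries` retries have been used up.
    """
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:  # assumption: in real code, narrow this to the 429 error
            if attempt == retries:
                raise
            sleep(backoff_factor * (2 ** attempt))
```

A request such as the build_payload call could be wrapped in a lambda and passed as func.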

Solution

I solved it in the following way:

  1. First of all, use Google Trends from your web browser while you are logged in with your Google account;
  2. To track the actual HTTP GET request (I am using Chromium), go to "More Tools" -> "Developer Tools" -> "Network" tab;
  3. Visit the Google Trends page and search for a trend; this triggers a lot of HTTP requests, which are listed in the left sidebar of the "Network" tab;
  4. Identify the GET request (in my case it was /trends/explore?q=topic&geo=US), right-click on it and select Copy -> Copy as cURL;
  5. Then go to this page (a cURL-to-Python converter), paste the cURL command on the left side, and copy the "headers" dictionary from the Python script generated on the right side of the page;
  6. Then go to your code and subclass the TrendReq class, so that you can pass the custom headers you just copied:
from pytrends.request import TrendReq as UTrendReq

GET_METHOD = 'get'

# Paste the "headers" dictionary copied in step 5 here.
headers = {
...
}


class TrendReq(UTrendReq):
    # Override the internal request helper so that every call sends the browser headers.
    def _get_data(self, url, method=GET_METHOD, trim_chars=0, **kwargs):
        return super()._get_data(url, method=GET_METHOD, trim_chars=trim_chars, headers=headers, **kwargs)

  7. Remove any other "import TrendReq" from your code, since it will now use the class you just created;
  8. Try again;
  9. If the error message ever comes back, repeat the procedure: you need to update the headers dictionary with fresh values, and you may have to solve a CAPTCHA.

After running the upgrade command via pip install, you should restart the Python kernel and reload the pytrends library.
