
Google crawl kills Places API quota

We have a few APIs on our site; one of them is a Places API. When Google's spider crawls our site, it hits the quota for our Places API.

I have reset the API over and over, and it is getting very tiring!

I also set my site up to run 3 different API projects with the same APIs (Google Places) and used logic to use one up, then switch to the next, and so on. However, even now that I have 450,000 calls per day, by noon the Google search spider has killed all 3 APIs!

This now means that my users can no longer use any section that relies on the Places API, which is a huge problem! I am not being charged for Google hitting the Google API calls, but it is destroying the user experience on my site and will not be tolerated!

Please help right away!

I imagine it rests in Google's hands to fix this bug in their system; there is really nothing I can personally do, as you can read above that I have done everything I can for my users' experience when visiting my site.

This is not a bug in their system. If you have thousands of unique URLs that all make API calls, and you have not blocked them from being crawled with robots.txt (see here), then this is a bug in your site.
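For example, a minimal robots.txt along these lines would keep crawlers away from the pages that trigger Places API calls (the /places/ path is a hypothetical stand-in; use whatever URL prefix your API-backed pages actually live under):

User-agent: *
Disallow: /places/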

I ended up solving this in a workaround way; for anyone else having this issue, here is what I did.

1) I have 3 API projects set up; each can make 150,000 calls a day.

2) I have logic set up to check whether the page is being accessed by a spider such as Googlebot.

3) If the session is coming from a spider, the 3rd API key is set to null.

4) The system tries each API key one by one: if the first result set is empty it tries number 2, and if 2 is empty it tries number 3.

5) Because the 3rd API key is set to null for a spider, those 150,000 calls are set aside for real users, but now we have to stop crawlers from crawling blank content.

6) In the logic block that switches from trying API 1, then 2, then 3, I made PHP rewrite my robots.txt file. If API 1 is usable, I set this:

file_put_contents('robots.txt', "User-agent: *\nDisallow:\n");

Same for API 2. If API 3 is being used, then I rewrite robots.txt to become:

file_put_contents('robots.txt', "User-agent: *\nDisallow: /\n");

This now sets aside 150,000 calls for users; the spiders cannot use those 150,000 calls, and once the other 300,000 calls have been exhausted, the site can no longer be crawled for the rest of the day by any spider. A rough sketch of the whole flow is shown below.
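Here is a minimal PHP sketch of the approach described in the steps above. The key names, the is_spider() check, and the places_search() helper are hypothetical stand-ins for illustration, not the exact code from my site:

<?php
// Three Google Places API keys, one per API project (placeholder values).
$apiKeys = ['KEY_PROJECT_1', 'KEY_PROJECT_2', 'KEY_PROJECT_3'];

// Rough spider detection based on the user-agent string (assumption: a
// simple pattern match is good enough to spot the major crawlers).
function is_spider() {
    $ua = $_SERVER['HTTP_USER_AGENT'] ?? '';
    return (bool) preg_match('/googlebot|bingbot|slurp|baiduspider/i', $ua);
}

// Hypothetical wrapper around the actual Places API request; replace the
// stub body with a real call that returns an array of results.
function places_search($apiKey, $term) {
    return []; // stub
}

$searchTerm = $_GET['q'] ?? '';

// Reserve the 3rd key for real users: spiders never get it.
if (is_spider()) {
    $apiKeys[2] = null;
}

// Try each key in turn; stop at the first non-empty result set.
$results = [];
$usedKeyIndex = null;
foreach ($apiKeys as $i => $key) {
    if ($key === null) {
        continue;
    }
    $results = places_search($key, $searchTerm);
    if (!empty($results)) {
        $usedKeyIndex = $i;
        break;
    }
}

// Rewrite robots.txt depending on which key is in use: keys 1 and 2 keep
// the site crawlable, key 3 (the user reserve) blocks all crawling.
if ($usedKeyIndex === 2) {
    file_put_contents('robots.txt', "User-agent: *\nDisallow: /\n");
} else {
    file_put_contents('robots.txt', "User-agent: *\nDisallow:\n");
}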

Problem solved! Told you I'd fix it myself if they couldn't.

Oh, and another note: because it's Google using Google APIs, I'm not being charged for the 300,000 calls that Google kills; I only get charged for what real users use up... pure perfection!
