
Google Trends - Quota limit - IP address changer

I am writing code that uses the unofficial Google Trends API ( https://github.com/GeneralMills/pytrends#trend ). However, after barely 10 requests I get the following error: Exceeded Google's Rate Limit. Please use time.sleep() to space requests.

It seems that the following command does not properly connect to Google services.

  pytrends = TrendReq(google_username, google_password, custom_useragent=None)

Therefore I tried to change my IP address along with Tor Browser as explained here: https://stackoverflow.com/a/34516846/7110706

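import socket

import requests
import socks  # PySocks / SocksiPy
from bs4 import BeautifulSoup
from stem import Signal
from stem.control import Controller

controller = Controller.from_port(port=9151)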

def connectTor():
    socks.setdefaultproxy(socks.PROXY_TYPE_SOCKS5 , "127.0.0.1", 9150, True)
    socket.socket = socks.socksocket

def renew_tor():
    controller.authenticate()
    controller.signal(Signal.NEWNYM)

def showmyip():
    url = "http://www.showmyip.gr/"
    r = requests.Session()
    page = r.get(url)
    soup = BeautifulSoup(page.content, "lxml")
    ip_address = soup.find("span",{"class":"ip_address"}).text.strip()
    print('New IP address is: ' + ip_address)

The main issue is in the following code:

def requestDailydatafromGT(keywords, geography, date):  #parameters must be strings 
    from pytrends.request import TrendReq
    import time
    from random import randint 

    google_username = ""  #put your gmail account
    google_password = ""
    path = ""

    #Connect to google
    pytrend = TrendReq(google_username, google_password, custom_useragent=None)

    requestdate=str(date)+' 3m'

    trend_payload = {'q': keywords,'hl': 'en-US','geo': geography, 'date': requestdate} #define parameters of the request
    mes=0

    while mes==0:
        try:
            results = pytrend.trend(trend_payload, return_type='dataframe').sort_index(axis=0, ascending=False)  # launch request in Google Trends
            mes=1

        except Exception:
            renew_tor()
            connectTor()
            time.sleep(randint(5,15))
            mes=0

    return results

The code seems to work, as the IP address changes over time. However, I am still stuck with the Google request quota limit error:

Exceeded Google's Rate Limit. Please use time.sleep() to space requests.

New IP address is : 178.217.187.39

Exceeded Google's Rate Limit. Please use time.sleep() to space requests.

New IP address is: 95.128.43.164

Do you know if there is a way to bypass the limitation? Perhaps Google Trends does not see the new IP address because the request is not correctly routed through Tor.

Thanks in advance.

Have you already tried to (re)connect to Google inside the while loop?

while mes == 0:
    pytrend = TrendReq(google_username, google_password, custom_useragent=None)  # Connect to Google
    try:
        results = pytrend.trend(trend_payload, return_type='dataframe').sort_index(axis=0, ascending=False)  # Launch request in Google Trends
        mes = 1
    except Exception:
        # Same recovery steps as in the question
        renew_tor()
        connectTor()
        time.sleep(randint(5, 15))

UPDATE 1: As the OP pointed out, my solution only works if a random user agent is used.

Therefore something like the following code should work:

import random
import string

def random_word(length):
    """Return a random word of 'length' letters."""
    return ''.join(random.choice(string.ascii_letters) for i in range(length))

[...]

def requestDailydatafromGT(keywords, geography, date):  #parameters must be strings 
    [...]
    while mes == 0:
        pytrend = TrendReq(google_username, google_password, custom_useragent=random_word(8)) # Connect to Google
        try:
            results = pytrend.trend(trend_payload, return_type='dataframe').sort_index(axis=0, ascending=False) # Launch request in Google Trends
            mes = 1
    [...]
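
The reconnect-and-retry idea from this answer can also be sketched without depending on pytrends. This is only a sketch under assumptions: `fetch_with_retries`, `make_client`, `do_request`, and `on_failure` are hypothetical names, not part of pytrends. In the code above, `make_client` would wrap the `TrendReq(...)` call, `do_request` the `pytrend.trend(...)` call, and `on_failure` the `renew_tor()`, `connectTor()`, and `time.sleep(...)` steps. Unlike the `while mes == 0` loop, it gives up after a fixed number of attempts instead of looping forever.

```python
import random
import string


def random_word(length):
    """Return a random word of 'length' ASCII letters."""
    return ''.join(random.choice(string.ascii_letters) for _ in range(length))


def fetch_with_retries(make_client, do_request, on_failure, max_attempts=5):
    """Retry do_request with a fresh client and user agent on each attempt.

    All three callables are hypothetical hooks supplied by the caller:
      make_client(user_agent) -> client object (e.g. wraps TrendReq(...))
      do_request(client)      -> result (e.g. wraps pytrend.trend(...))
      on_failure(attempt)     -> None (e.g. renew the Tor circuit, then sleep)
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        client = make_client(random_word(8))  # new session, new user agent
        try:
            return do_request(client)
        except Exception as error:  # broad on purpose, mirrors the loop above
            last_error = error
            on_failure(attempt)
    raise RuntimeError("gave up after %d attempts" % max_attempts) from last_error
```

Capping the number of attempts means a persistent rate limit surfaces as an error rather than an endless loop of circuit renewals.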

UPDATE 2: There is no need to authenticate every time you renew Tor. You can simply do it once after the controller creation.

controller = Controller.from_port(port=9051)
controller.authenticate(<YOUR_TOR_CONTROL_PASSWORD>)

For reference, the standard ports are:

Tor: 9050 | Tor control: 9051

Tor Browser: 9150 | Tor Browser control: 9151

I used ports 9050 and 9051 after uncommenting "ControlPort 9051" in the default Tor configuration file (and adding my hashed password).
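
Since the question uses 9150/9151 and this answer uses 9050/9051, it may help to keep the port pairs in one place. A small sketch, assuming a local Tor instance: `TOR_PORTS` and `tor_proxies` are hypothetical names. The `socks5h://` proxy URL scheme is the one `requests` accepts when installed with SOCKS support (`pip install requests[socks]`), which is a different technique from the `socket.socket = socks.socksocket` patch used in the question.

```python
# Standard ports as listed above: (SOCKS port, control port).
TOR_PORTS = {
    "tor": (9050, 9051),
    "tor_browser": (9150, 9151),
}


def tor_proxies(flavor="tor", host="127.0.0.1"):
    """Build a proxies mapping for requests pointing at the local Tor SOCKS port.

    'socks5h' asks the proxy to resolve host names too, so DNS lookups
    also go through Tor instead of the local resolver.
    """
    socks_port, _control_port = TOR_PORTS[flavor]
    url = "socks5h://%s:%d" % (host, socks_port)
    return {"http": url, "https": url}
```

Passing `proxies=tor_proxies("tor_browser")` to `requests.get(...)` would route just that request through Tor Browser's SOCKS port, assuming Tor Browser is running.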
