
Twython Rate Limit Issue

I am wondering how I can automate my program to fetch tweets at the maximum rate of 180 requests per 15 minutes; at the maximum count of 100 tweets per request, that totals 18,000 tweets per window. I am creating this program for an independent case study at school.

I would like my program to avoid being rate limited and terminated. So, what I would like it to do is constantly use the maximum number of requests per 15 minutes and run for 24 hours without user interaction, retrieving all the tweets it can for analysis.
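At that cap, the arithmetic works out to one request every five seconds (15 × 60 seconds / 180 requests = 5 seconds). A minimal sketch of just that pacing logic (the do_one_search_request placeholder is illustrative, standing in for a real twitter.search call):

import time

WINDOW_SECONDS = 15 * 60   # length of one rate-limit window
MAX_REQUESTS = 180         # search requests allowed per window
PAUSE = WINDOW_SECONDS / float(MAX_REQUESTS)  # 5.0 seconds between requests

def do_one_search_request():
    # Placeholder: issue one twitter.search(...) call here
    pass

while True:
    do_one_search_request()
    time.sleep(PAUSE)  # never exceeds 180 requests per 15-minute window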

Here is my code. It fetches tweets for a query and writes them to a text file, but it eventually gets rate limited. I would really appreciate the help.

import logging
import time
import csv
import twython
import json

app_key = ""
app_secret = ""
oauth_token = ""
oauth_token_secret = ""

twitter = twython.Twython(app_key, app_secret, oauth_token, oauth_token_secret)

tweets = []
MAX_ATTEMPTS = 1000000
# Max Number of tweets per 15 minutes
COUNT_OF_TWEETS_TO_BE_FETCHED = 18000 

for i in range(0, MAX_ATTEMPTS):

    if COUNT_OF_TWEETS_TO_BE_FETCHED < len(tweets):
        break

    if 0 == i:
        # First request: fetch the newest results, 100 tweets max per request
        results = twitter.search(q="$AAPL", count='100', lang='en')
    else:
        # Subsequent requests: page backwards through older results
        results = twitter.search(q="$AAPL", include_entities='true', max_id=next_max_id)

    for result in results['statuses']:
        tweets.append(result)  # keep count so the loop can stop at the target
        print(result)

        with open('tweets.txt', 'a') as outfile:
            json.dump(result, outfile, sort_keys=True, indent=4)

    try:
        next_results_url_params = results['search_metadata']['next_results']
        next_max_id = next_results_url_params.split('max_id=')[1].split('&')[0]
    except KeyError:
        # No further pages of results
        break

You should be using Twitter's Streaming API.

This will allow you to receive a near-realtime feed of your search. You can write those tweets to a file just as fast as they come in.

Using the track parameter, you will be able to receive only the specific tweets you're interested in.

You'll need to use Twython's TwythonStreamer, and your code will look something like this:

from twython import TwythonStreamer

class MyStreamer(TwythonStreamer):
    def on_success(self, data):
        # Called for every message on the stream; only full tweets carry 'text'
        if 'text' in data:
            print(data['text'])

    def on_error(self, status_code, data):
        print(status_code)

stream = MyStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
# Receive only tweets that mention the $AAPL cashtag
stream.statuses.filter(track='$AAPL')
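
If you want the same tweets-to-file behaviour as the REST version in the question, on_success can append each status as it arrives. A sketch under that assumption (the FileStreamer name and the one-JSON-document-per-line format are illustrative choices; tweets.txt is carried over from the question):

import json
from twython import TwythonStreamer

class FileStreamer(TwythonStreamer):
    def on_success(self, data):
        # Only full statuses carry a 'text' field; skip keep-alives and notices
        if 'text' in data:
            with open('tweets.txt', 'a') as outfile:
                json.dump(data, outfile, sort_keys=True)
                outfile.write('\n')  # one JSON document per line

    def on_error(self, status_code, data):
        print(status_code)
        self.disconnect()  # stop streaming instead of hammering the endpoint

stream = FileStreamer(APP_KEY, APP_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET)
stream.statuses.filter(track='$AAPL')

For an unattended 24-hour run, handle errors (especially HTTP 420, the streaming rate-limit status) by backing off before reconnecting rather than reconnecting immediately.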
