
Gracefully handle errors and exceptions for user_timeline method in Tweepy

I'm collecting tweets for a large number of users, so the script will run unsupervised for days or weeks. I have a list of user IDs in big_list. I think some of the accounts are protected (their tweets are private), which makes my script stop, so I'd like a way for the script to continue on to the next user_id (and maybe print a warning message).

I'd also like suggestions on how to make it robust to other errors or exceptions (for example, having the script sleep on an error or timeout).

This is a summary of what I have:

import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
my_api = tweepy.API(auth)

for id_str in big_list:
    all_tweets = get_all_tweets(id_str=id_str, api=my_api)
    #Here: insert some tweets into my database

The errors are raised inside the get_all_tweets function, which basically calls the following repeatedly:

my_api.user_timeline(user_id = id_str, count=200)

Just in case, the traceback it gives is the following:

/home/username/anaconda/lib/python2.7/site-packages/tweepy/binder.pyc in execute(self)
    201                 except Exception:
    202                     error_msg = "Twitter error response: status code = %s" % resp.status
--> 203                 raise TweepError(error_msg, resp)
    204 
    205             # Parse the response payload

TweepError: Not authorized.

Let me know if you need more details. Thanks!

----------- EDIT --------

This question has some info.

I guess I could wrap the call in a try/except block that handles the different types of errors? I don't know all the relevant ones, so best practices from someone with field experience would be appreciated!

---------- EDIT 2 -------

I'm getting some "Rate limit exceeded" errors, so I'm making the loop sleep like this. The else branch handles the "Not authorized" error and any other (unknown?) errors. This still makes me lose an element of big_list, though, because after sleeping the loop moves on to the next id instead of retrying the one that failed.

import time

for id_str in big_list:
    try:
        all_tweets = get_all_tweets(id_str=id_str, api=my_api)
        # HERE: save tweets
    except tweepy.TweepError as e:
        # Check the error text: the exception object itself never equals a string
        if "Rate limit exceeded" in str(e):
            time.sleep(60 * 5)  # Sleep for 5 minutes
        else:
            print(e)
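
One way to avoid losing the element is to retry the same id after sleeping instead of moving on. The following is only a minimal sketch under some assumptions: collect_with_retry, fetch and is_rate_limit are hypothetical names (not part of Tweepy), fetch stands in for get_all_tweets, and in the real script you would probably catch tweepy.TweepError rather than a bare Exception.

```python
import time


def collect_with_retry(ids, fetch, is_rate_limit,
                       wait_seconds=300, max_attempts=3, sleep=time.sleep):
    """Call fetch(id) for every id, retrying the same id after a rate-limit error.

    All names here are hypothetical helpers, not Tweepy API.
    """
    results, skipped = {}, []
    for id_str in ids:
        for attempt in range(max_attempts):
            try:
                results[id_str] = fetch(id_str)
                break  # success: move on to the next id
            except Exception as e:  # narrow this to tweepy.TweepError in practice
                if is_rate_limit(e) and attempt < max_attempts - 1:
                    sleep(wait_seconds)  # wait, then retry the SAME id
                else:
                    print("Skipping {}: {}".format(id_str, e))
                    skipped.append(id_str)
                    break
    return results, skipped
```

It could be called as collect_with_retry(big_list, lambda i: get_all_tweets(id_str=i, api=my_api), lambda e: "Rate limit exceeded" in str(e)). The skipped list can be saved and re-run later.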

You can simply catch the exception and "pass", so the loop continues with the next id:

import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
my_api = tweepy.API(auth)

for id_str in big_list:
    try:
        all_tweets = get_all_tweets(id_str=id_str, api=my_api)
    except Exception:
        pass
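
A bare "pass" hides every failure, which makes a run lasting days hard to debug. As a hedged alternative, here is a small sketch that logs each failure and remembers the ids that failed. collect_all, fetch and the logger name are made up for illustration; fetch stands in for get_all_tweets.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("tweet_collector")  # hypothetical logger name


def collect_all(ids, fetch):
    """Fetch tweets per id; log and record failures instead of passing silently."""
    failed = []
    for id_str in ids:
        try:
            tweets = fetch(id_str)
            # here: save tweets to the database
        except Exception:
            # log.exception records the message plus the full traceback
            log.exception("Could not fetch tweets for user %s", id_str)
            failed.append(id_str)
    return failed
```

The returned list shows which users to retry or inspect once the run has finished.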

I am really late on this, but I ran into the same problem recently. As for the need for time.sleep(), I solved that part thanks to alecxe's reply to this question.

I know this is an old question, but I hope this will help someone in the future.

You can use Tweepy's "wait_on_rate_limit" and "wait_on_rate_limit_notify" options when you create the API object, and add general Tweepy error handling around the call. Then, once you see which specific errors are printed, you can customize the code to handle each one. It should look something like this:

import tweepy
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
my_api = tweepy.API(auth, wait_on_rate_limit=True, wait_on_rate_limit_notify=True)

for id_str in big_list:
    try:
        all_tweets = get_all_tweets(id_str=id_str, api=my_api)
    except tweepy.TweepError as e:
        print("Tweepy Error: {}".format(e))
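
To handle each error differently, one option is to sort the exception into "retry", "skip" or "unknown" before deciding what to do. This is only a sketch with several assumptions: classify_error and the two code sets are hypothetical names; the numeric codes come from the Twitter API v1.1 error code table; and the api_code attribute is what Tweepy 3.x sets on TweepError, so other versions may expose it differently. The function does not import Tweepy and only inspects the exception it is given.

```python
SKIP_CODES = {34, 50, 63}     # page not found, user not found, user suspended
RETRY_CODES = {88, 130, 131}  # rate limit exceeded, over capacity, internal error


def classify_error(exc):
    """Return 'retry', 'skip' or 'unknown' for an exception from an API call."""
    # api_code is an attribute of Tweepy 3.x TweepError (assumption for other versions)
    code = getattr(exc, "api_code", None)
    if code in RETRY_CODES:
        return "retry"
    # Protected accounts typically show up as "Not authorized." without a code
    if code in SKIP_CODES or "Not authorized" in str(exc):
        return "skip"
    return "unknown"
```

Inside the except block you could then sleep and retry on "retry", continue on "skip", and log the error on "unknown" for later inspection.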

