How to get every tweet with a specific hashtag using tweepy and the Twitter REST API

Question

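For a data visualization project, I need to collect all tweets with a specific hashtag (is this even possible?). To do this, I am using the code below, which uses Tweepy and the REST API. However, it only downloads about 2500 tweets or fewer. I would like to know how to get around this limit. Is there a paid subscription or something else I should buy, or how should I modify the code?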

#!/usr/bin/python
# -*- coding: utf-8 -*-
# this file is configured for rtl language and farsi characters

import json  # needed for json.dumps() below
import sys
from key import *
import tweepy

#imported from the key.py file
API_KEY =KAPI_KEY
API_SECRET =KAPI_SECRET
OAUTH_TOKEN =KOAUTH_TOKEN
OAUTH_TOKEN_SECRET =KOAUTH_TOKEN_SECRET

auth = tweepy.AppAuthHandler(API_KEY, API_SECRET)

api = tweepy.API(auth, wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True)

if not api:
    print("Can't Authenticate")
    sys.exit(-1)

def write_unicode(text, charset='utf-8'):
    return text.encode(charset)

searchQuery = "#کرونا"  # this is what we're searching for
maxTweets = 100000  # Some arbitrary large number
tweetsPerQry = 100  # this is the max the API permits
fName = 'Corona-rest8.txt'  # We'll store the tweets in a text file.

sinceId = None

max_id = -1
tweetCount: int = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, "wb") as f:
    while tweetCount < maxTweets:
        try:
            if max_id <= 0:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry)
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            since_id=sinceId)
            else:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1))
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1),
                                            since_id=sinceId)
            if not new_tweets:
                print("No more tweets found")
                break
            for tweet in new_tweets:
                # Print the user's location when one is set (it may be empty or None).
                location = tweet._json["user"]["location"]
                if location:
                    print(location)
                # Write the tweet text as one UTF-8 encoded JSON string per line.
                line = json.dumps(tweet._json["text"], ensure_ascii=False) + "\n"
                f.write(line.encode("utf-8"))

            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            max_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            # Just exit if any error
            print("some error : " + str(e))
            break
print("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))

Answer 1

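The tweepy API reference for api.search() sheds some light on this:

Please note that Twitter's search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface.

To answer your question directly: it is not possible to get an exhaustive list of tweets from the API (due to several limitations). However, some scraping-based Python libraries can be used to work around these API limitations, for example twitterscraper by @taspinar.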
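To see why the loop in the question stops early no matter how large maxTweets is, it may help to isolate the max_id paging into a helper that runs without network access. The sketch below is only an illustration: paginate_by_max_id, search_page, FAKE_INDEX and fake_search_page are hypothetical names that are not part of tweepy, and the fake index stands in for whatever subset of tweets the search service happens to have indexed.

```python
from typing import Callable, Iterator, List, Optional


def paginate_by_max_id(
    search_page: Callable[[Optional[int]], List[dict]],
    limit: int,
) -> Iterator[dict]:
    """Yield up to `limit` tweets, walking backwards through the results by ID.

    `search_page(max_id)` is a hypothetical callable: it must return one page
    of tweets (dicts with an "id" key) whose IDs are <= max_id, or the newest
    page when max_id is None.
    """
    max_id: Optional[int] = None
    yielded = 0
    while yielded < limit:
        page = search_page(max_id)
        if not page:
            # The search index has nothing older to offer, so stop here.
            return
        for tweet in page:
            if yielded >= limit:
                return
            yield tweet
            yielded += 1
        # Ask for tweets strictly older than the oldest one seen so far.
        max_id = min(tweet["id"] for tweet in page) - 1


# Stand-in for the search index: it only "remembers" 250 tweets, newest first.
FAKE_INDEX = [{"id": i, "text": "tweet %d" % i} for i in range(250, 0, -1)]


def fake_search_page(max_id: Optional[int], page_size: int = 100) -> List[dict]:
    """Return at most `page_size` indexed tweets with ID <= max_id."""
    matching = [t for t in FAKE_INDEX if max_id is None or t["id"] <= max_id]
    return matching[:page_size]


tweets = list(paginate_by_max_id(fake_search_page, limit=100000))
print(len(tweets))  # 250: bounded by the index, not by the requested limit
```

The requested limit of 100000 is never reached because the paging ends as soon as the index returns an empty page, which is the same condition that triggers "No more tweets found" in the question's code. With tweepy 3.x, search_page could presumably wrap api.search(q=..., count=100, max_id=max_id) and return each status's _json, but changing the loop alone will not surface tweets that the search service has not indexed.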

Solution 1
1 2020-04-06 13:07:49