How to get every single tweet with a certain hashtag using Tweepy and the Twitter REST API

For a data visualization project I need to gather all tweets with a certain hashtag (is that possible at all?). For this purpose I am using the code below, which uses Tweepy and the REST API. However, it only downloads around 2500 tweets or fewer. How can I get past this limit? Is there a pro subscription or something else I should purchase, or how should I modify the code?

#!/usr/bin/python
# -*- coding: utf-8 -*-
# this file is configured for rtl language and farsi characters

import json  # needed for json.dumps() in the download loop below
import sys
from key import *
import tweepy

# imported from the key.py file
API_KEY = KAPI_KEY
API_SECRET = KAPI_SECRET
OAUTH_TOKEN = KOAUTH_TOKEN
OAUTH_TOKEN_SECRET = KOAUTH_TOKEN_SECRET

auth = tweepy.AppAuthHandler(API_KEY, API_SECRET)

api = tweepy.API(auth, wait_on_rate_limit=True,
                 wait_on_rate_limit_notify=True)

if not api:
    print("Can't Authenticate")
    sys.exit(-1)

def write_unicode(text, charset='utf-8'):
    return text.encode(charset)

searchQuery = "#کرونا"  # this is what we're searching for
maxTweets = 100000  # Some arbitrary large number
tweetsPerQry = 100  # this is the max the API permits
fName = 'Corona-rest8.txt'  # We'll store the tweets in a text file.

sinceId = None

max_id = -1
tweetCount: int = 0
print("Downloading max {0} tweets".format(maxTweets))
with open(fName, "wb") as f:
    while tweetCount < maxTweets:
        try:
            if max_id <= 0:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry)
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            since_id=sinceId)
            else:
                if not sinceId:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1))
                else:
                    new_tweets = api.search(q=searchQuery, count=tweetsPerQry,
                                            max_id=str(max_id - 1),
                                            since_id=sinceId)
            if not new_tweets:
                print("No more tweets found")
                break
            for tweet in new_tweets:

                #print(tweet._json["created_at"])
                if str(tweet._json["user"]["location"])!="":
                    print(tweet._json["user"]["location"])
                myDict = json.dumps(tweet._json["text"], ensure_ascii=False).encode('utf8')+ "\n".encode('ascii')
                f.write(myDict)

            tweetCount += len(new_tweets)
            print("Downloaded {0} tweets".format(tweetCount))
            max_id = new_tweets[-1].id
        except tweepy.TweepError as e:
            # Just exit if any error
            print("some error : " + str(e))
            break
print("Downloaded {0} tweets, Saved to {1}".format(tweetCount, fName))

The Tweepy API reference for api.search() provides a bit of color on this:

Please note that Twitter's search service and, by extension, the Search API is not meant to be an exhaustive source of Tweets. Not all Tweets will be indexed or made available via the search interface.

To answer your question directly: it is not possible to acquire an exhaustive list of tweets from the API, because of several limitations. However, a few scraping-based Python libraries are available to work around these API limitations, such as @taspinar's twitterscraper.
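For what it's worth, the max_id paging in the question is already the usual pattern. It simply stops once the search index has nothing older to return, which is the limitation quoted above. Here is a minimal sketch of that loop with the API call factored out, so the logic can be run without credentials. The names collect_tweets, fetch_page, FAKE_INDEX and fake_fetch_page are hypothetical and not part of Tweepy; with Tweepy 3.x, fetch_page could wrap the api.search(...) call from the question.

```python
from typing import Any, Callable, Dict, List, Optional

Tweet = Dict[str, Any]  # assumed minimal shape: {"id": int, "text": str}


def collect_tweets(fetch_page: Callable[[Optional[int]], List[Tweet]],
                   max_tweets: int) -> List[Tweet]:
    """Page backwards through search results using max_id.

    fetch_page(max_id) is a hypothetical callable returning one page of
    tweets, newest first, with ids <= max_id (or the newest page when
    max_id is None).
    """
    collected: List[Tweet] = []
    max_id: Optional[int] = None
    while len(collected) < max_tweets:
        page = fetch_page(max_id)
        if not page:
            # The index has nothing older to return, so stop here.
            break
        collected.extend(page)
        # Next request: only tweets strictly older than the oldest seen.
        max_id = min(t["id"] for t in page) - 1
    return collected[:max_tweets]


# Stand-in for the real API: 250 fake tweets, served 100 per page.
FAKE_INDEX = [{"id": i, "text": "tweet %d" % i} for i in range(250, 0, -1)]


def fake_fetch_page(max_id: Optional[int]) -> List[Tweet]:
    matches = [t for t in FAKE_INDEX if max_id is None or t["id"] <= max_id]
    return matches[:100]


tweets = collect_tweets(fake_fetch_page, max_tweets=100000)
print(len(tweets))  # 250: everything the fake index holds, not 100000
```

However large max_tweets is, the loop can only return what the index exposes, so changing the paging code alone will not lift the cap you are seeing.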

