Tweepy (Twitter API) Not Returning All Search Results

I'm using the search feature of Tweepy for Twitter, and for some reason the search results are limited to 15. Here is my code:

results = api.search(q="Football", rpp=1000)

for result in results:
    # clNormalizeString is the asker's own helper (not shown here)
    print("%s" % clNormalizeString(result.text))

print(len(results))

Only 15 results are returned. Does this have something to do with the results being split across pages?

The question is more about the Twitter API than about tweepy itself.

According to the documentation, the count parameter defines:

The number of tweets to return per page, up to a maximum of 100. Defaults to 15. This was formerly the "rpp" parameter in the old Search API.
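To make the arithmetic concrete, here is a small sketch of how many requests a given total implies under those limits. The helper names are made up for illustration, and it assumes a count above 100 is simply capped at 100, which the quoted text does not spell out.

```python
from math import ceil

MAX_PER_PAGE = 100     # per-page maximum quoted in the documentation
DEFAULT_PER_PAGE = 15  # page size used when count is not given


def effective_count(requested=None):
    """Page size we expect to get back (assumption: oversized values are capped)."""
    if requested is None:
        return DEFAULT_PER_PAGE
    return max(1, min(requested, MAX_PER_PAGE))


def pages_needed(total, per_page=MAX_PER_PAGE):
    """Minimum number of requests needed to collect `total` tweets."""
    return ceil(total / effective_count(per_page))


print(effective_count(1000))              # 100
print(pages_needed(1000))                 # 10
print(pages_needed(1000, per_page=None))  # 67
```

So asking for 1000 tweets means at least 10 requests at count=100, and far more if the default of 15 is left in place.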

FYI, you can use tweepy.Cursor to get paginated results, like this:

import tweepy


auth = tweepy.OAuthHandler(..., ...)
auth.set_access_token(..., ...)

api = tweepy.API(auth)
for tweet in tweepy.Cursor(api.search,
                           q="google",
                           count=100,
                           result_type="recent",
                           include_entities=True,
                           lang="en").items():
    print(tweet.created_at, tweet.text)
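
Conceptually, the cursor keeps requesting pages and hands the tweets back one at a time. The sketch below imitates that idea in plain Python; it is not tweepy's implementation, and fake_search, iter_items and the data are hypothetical stand-ins so it runs without credentials.

```python
def fake_search(page, count=100, total=250):
    """Hypothetical stand-in for one API request: returns one page of tweet ids."""
    start = page * count
    return list(range(start, min(start + count, total)))


def iter_items(fetch_page, limit=None):
    """Yield items page by page until a page is empty or `limit` items were yielded."""
    yielded = 0
    page = 0
    while True:
        batch = fetch_page(page)
        if not batch:  # an empty page means there is nothing left to fetch
            return
        for item in batch:
            if limit is not None and yielded >= limit:
                return
            yield item
            yielded += 1
        page += 1


all_items = list(iter_items(fake_search))
first_120 = list(iter_items(fake_search, limit=120))
print(len(all_items), len(first_120))  # 250 120
```

With tweepy itself, passing a number to items(), for example .items(500), caps the total in the same way, which is worth doing because an uncapped loop keeps issuing requests until the results or the rate limit run out.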

See also: https://github.com/tweepy/tweepy/issues/197

Hope that helps.

Here's a minimal working example (once you replace the fake keys with real ones).

import tweepy

def get_authorization():

    info = {"consumer_key": "A7055154EEFAKE31BD4E4F3B01F679",
            "consumer_secret": "C8578274816FAEBEB3B5054447B6046F34B41F52",
            "access_token": "15225728-3TtzidHIj6HCLBsaKX7fNpuEUGWHHmQJGeF",
            "access_secret": "61E3D5BD2E1341FFD235DF58B9E2FC2C22BADAD0"}

    auth = tweepy.OAuthHandler(info['consumer_key'], info['consumer_secret'])
    auth.set_access_token(info['access_token'], info['access_secret'])
    return auth


def get_tweets(query, n):
    _max_queries = 100  # arbitrarily chosen value
    api = tweepy.API(get_authorization())

    tweets = tweet_batch = api.search(q=query, count=n)
    ct = 1
    while len(tweets) < n and ct < _max_queries:
        print(len(tweets))
        tweet_batch = api.search(q=query, 
                                 count=n - len(tweets),
                                 max_id=tweet_batch.max_id)
        tweets.extend(tweet_batch)
        ct += 1
    return tweets

Note: I did try using a for loop, but the Twitter API sometimes returns fewer than 100 results (despite being asked for 100, and 100 being available). I'm not sure why this is, but it's the reason I didn't include a check to break the loop if tweet_batch is empty. You may want to add such a check yourself, since there is a query rate limit.
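
One way such a check could look is sketched below. To keep it runnable without credentials, fake_search is a hypothetical stand-in for api.search that works on plain integer ids and deliberately returns short batches; with tweepy you would keep using tweet_batch.max_id as in the example above.

```python
def fake_search(q, count=100, max_id=None, _ids=tuple(range(1000, 750, -1))):
    """Hypothetical stand-in for api.search: at most 90 ids <= max_id, newest first."""
    pool = [i for i in _ids if max_id is None or i <= max_id]
    return pool[:min(count, 90)]  # deliberately fewer than asked for


def get_tweets(search, query, n, max_queries=100):
    tweets = []
    max_id = None
    for _ in range(max_queries):
        batch = search(query, count=min(n - len(tweets), 100), max_id=max_id)
        if not batch:  # nothing older is available, so stop spending requests
            break
        tweets.extend(batch)
        if len(tweets) >= n:
            break
        max_id = min(batch) - 1  # next request starts just below the oldest id seen
    return tweets[:n]


print(len(get_tweets(fake_search, "Football", 200)))  # 200
print(len(get_tweets(fake_search, "Football", 500)))  # 250
```

The second call asks for more than the fake source holds, so the loop ends on the empty batch after four requests instead of running all 100.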

Another note: you can keep from running into rate-limit errors by passing wait_on_rate_limit=True, which makes tweepy wait until the limit resets, like so:

        api = tweepy.API(get_authorization(), wait_on_rate_limit=True)
