
Get tweets with specific status IDs using tweepy

I have a list of the specific status IDs of tweets that I need to obtain. The tweepy documentation provides the following:

 API.get_status(id)

Returns a single status specified by the ID parameter.
Parameters: id – The numerical ID of the status.
Return type:    Status object

I can't work out how to use this, and I can't find any examples. Is this even the right method?

My list of IDs is 2240 items long and looks something like this:

response_ids = [717289507981107201, 717289501337509888, ..., 716684885411237888]

These IDs were obtained from the 'in_reply_to_status_id' field of tweets that I already have (I want to match the tweets I have to the tweets that they were written in response to).

I basically want to write something like:

for id in response_ids:
    tweet = API.get_status(id)

Any help on how to do this, or advice about whether it is possible, would be much appreciated.

It is better to use the 'statuses_lookup' method, which fetches up to 100 tweets per request instead of one. More information is available at http://docs.tweepy.org/en/v3.5.0/api.html#API.statuses_lookup

Before running the program below, obtain your consumer key and secret and your access token and secret.

import tweepy

# Placeholders: substitute your own app credentials (as strings).
consumer_key = "xxxx"
consumer_secret = "xxxx"
access_token = "xxxx"
access_token_secret = "xxxx"

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

# id_list is the list of tweet ids; statuses_lookup accepts at most 100 ids per call.
tweets = api.statuses_lookup(id_list)
# Collect the text of every tweet that was returned.
tweet_txt = [tweet.text for tweet in tweets]
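
Because the question's list holds 2240 IDs and statuses_lookup takes at most 100 per call, the list has to be split into batches first. The following is a minimal sketch of that batching, not part of the original answer: the helper names `chunked` and `lookup_all` are hypothetical, and the lookup function is passed in as a parameter so the batching logic can be tried without Twitter credentials.

```python
from typing import Callable, Iterable, Iterator, List, Sequence


def chunked(ids: Sequence[int], size: int = 100) -> Iterator[List[int]]:
    """Yield consecutive slices of `ids`, each at most `size` items long."""
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def lookup_all(lookup: Callable[[List[int]], Iterable],
               ids: Sequence[int],
               batch_size: int = 100) -> list:
    """Call `lookup` once per batch and concatenate everything it returns.

    `lookup` is assumed to behave like api.statuses_lookup: it takes a list
    of ids and returns an iterable of results.
    """
    results = []
    for batch in chunked(ids, batch_size):
        results.extend(lookup(batch))
    return results


# Assumed usage with the `api` object created above:
# tweets = lookup_all(api.statuses_lookup, response_ids)
```

With 2240 IDs this makes 23 calls (22 batches of 100 and one of 40). Rate limits still apply, so creating the client with `tweepy.API(auth, wait_on_rate_limit=True)` may be worth considering for larger lists.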

I think I've worked it out.

get_status does seem to be the right thing to use, although I initially had some problems with pagination errors. I adapted some code posted in response to a similar problem to come up with this solution:

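import itertools

def paginate(iterable, page_size):
    """Yield successive lists of at most page_size items from iterable."""
    while True:
        i1, i2 = itertools.tee(iterable)
        iterable, page = (itertools.islice(i1, page_size, None),
                list(itertools.islice(i2, page_size)))
        if len(page) == 0:
            break
        yield page

# With a page size of 1, each page holds exactly one id.
results = []
for page in paginate(response_ids, 1):
    # Keep every result instead of overwriting it on each iteration.
    results.append(api.get_status(page[0])._json)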
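
Once the parent tweets are fetched, the remaining step from the question is pairing each reply with the tweet it answers. Here is a rough sketch of that step, assuming both collections are plain dicts like the `_json` payloads above, with an "id" key and an "in_reply_to_status_id" key; the function name `match_replies` is hypothetical. A lookup can return fewer tweets than IDs requested (for instance when a tweet has been deleted), so unmatched replies are paired with None.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def match_replies(replies: Iterable[dict],
                  originals: Iterable[dict]
                  ) -> List[Tuple[dict, Optional[dict]]]:
    """Pair each reply with the tweet it responds to, or None if unavailable."""
    # Index the fetched tweets by id for constant-time lookup.
    by_id: Dict[int, dict] = {tweet["id"]: tweet for tweet in originals}
    return [(reply, by_id.get(reply.get("in_reply_to_status_id")))
            for reply in replies]
```

Indexing by ID first avoids scanning the fetched list once per reply, which matters little at 2240 items but keeps the matching linear as the data grows.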
