Tweepy: error when using Paginator to extract media data

Question

My goal is to extract the media data from tweets. I'm using the Twitter API v2, and when I extract fewer than 100 tweets I have no problems, but when I use Paginator I get the following error:

users = {u["id"]: u for u in tweets.includes['users']}
AttributeError: 'Paginator' object has no attribute 'includes'.

I have not been able to change the code so that it extracts the media data, and I don't know whether there is another way to get this data. Any help would be appreciated!

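import pandas as pd
import tweepy

import config  # assumed to be a local module that defines BEARER_TOKEN

client = tweepy.Client(bearer_token=config.BEARER_TOKEN)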

query = 'climate change -is:retweet has:media'

# your start and end time for fetching tweets
start_time = '2020-01-01T00:00:00Z'
end_time = '2020-01-31T00:00:00Z'

# get tweets from the API
tweets = tweepy.Paginator(client.search_all_tweets,
                          query=query,
                          start_time=start_time,
                          end_time=end_time,
                          tweet_fields=['context_annotations', 'created_at','source','public_metrics',
                                                'lang','referenced_tweets','reply_settings','conversation_id',
                                                'in_reply_to_user_id','geo'],
                          expansions=['attachments.media_keys','author_id','geo.place_id'],
                          media_fields=['preview_image_url','type','public_metrics','url'],
                          place_fields=['place_type', 'geo'],
                          user_fields=['name', 'username', 'location', 'verified', 'description',
                                               'profile_image_url','entities'],
                          max_results=100)

# Get users, media, place list from the includes object
users = {u["id"]: u for u in tweets.includes['users']}
media = {m["media_key"]: m for m in tweets.includes['media']}
# places = {p["id"]: p for p in tweets.includes['places']}

# create a list of records
tweet_info_ls = []
# iterate over each tweet and corresponding user details
for tweet in tweets.data:
    # metrics = tweet.organic_metrics
    # User Metadata
    user = users[tweet.author_id]
    # Media files
    attachments = tweet.data['attachments']
    media_keys = attachments['media_keys']
    link_image = media[media_keys[0]].preview_image_url
    url_image = media[media_keys[0]].url
    link_type = media[media_keys[0]].type
    link_public_metrics = media[media_keys[0]].public_metrics
    # Public metrics
    public_metrics = tweet.data['public_metrics']
    retweet_count = public_metrics['retweet_count']
    reply_count = public_metrics['reply_count']
    like_count = public_metrics['like_count']
    quote_count = public_metrics['quote_count']
    tweet_info = {
        'id': tweet.id,
        'author_id': tweet.author_id,
        'lang': tweet.lang,
        'geo': tweet.geo,
        # 'tweet_entities': metrics,
        'referenced_tweets': tweet.referenced_tweets,
        'reply_settings': tweet.reply_settings,
        'created_at': tweet.created_at,
        'text': tweet.text,
        'source': tweet.source,
        'retweet_count': retweet_count,
        'reply_count': reply_count,
        'like_count': like_count,
        'quote_count': quote_count,
        'name': user.name,
        'username': user.username,
        'location': user.location,
        'verified': user.verified,
        'description': user.description,
        'entities': user.entities,
        'profile_image': user.profile_image_url,
        'media_keys': link_image,
        'type': link_type,
        'link_public_metrics': link_public_metrics,
        'url_image': url_image
    }
    tweet_info_ls.append(tweet_info)

# create dataframe from the extracted records
df = pd.DataFrame(tweet_info_ls)

Answer 1

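You have to iterate through the pages to get the includes (and the data, by the way). A Paginator is only an iterator over responses, so it has no includes attribute of its own; each page it yields is a response with its own data and includes.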

paginator = tweepy.Paginator(client.search_all_tweets, [...])

for page in paginator:
    print(page.data)      # The tweets in that page
    print(page.includes)  # The includes in that page
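
To connect this with the code in the question, here is a minimal sketch of how the per-page lookups could be built. The helper name collect_media_records is hypothetical, and the sketch assumes each page exposes .data (a list of tweets, or None on an empty page) and .includes (a dict that may hold 'users' and 'media' lists), as the pages yielded by the Paginator do. The pages at the bottom are stand-in objects so the example runs without credentials or network access; they are not real API output.

```python
from types import SimpleNamespace


def collect_media_records(pages):
    """Build one record per tweet from an iterable of response pages.

    Hypothetical helper: each page is assumed to expose `.data` and
    `.includes` in the same shape as the pages in the loop above.
    """
    records = []
    for page in pages:
        if not page.data:  # an empty page carries no tweets
            continue
        includes = page.includes or {}
        # Rebuild the lookup tables for every page: includes only
        # describe the tweets of the page they arrived with.
        users = {u.id: u for u in includes.get("users", [])}
        media = {m.media_key: m for m in includes.get("media", [])}
        for tweet in page.data:
            attachments = getattr(tweet, "attachments", None) or {}
            media_keys = attachments.get("media_keys", [])
            first = media.get(media_keys[0]) if media_keys else None
            user = users.get(tweet.author_id)
            records.append({
                "id": tweet.id,
                "text": tweet.text,
                "username": getattr(user, "username", None),
                "media_type": getattr(first, "type", None),
                "media_url": getattr(first, "url", None),
                "preview_image_url": getattr(first, "preview_image_url", None),
            })
    return records


# Stand-in pages (assumed shape, made-up values) to exercise the helper.
fake_pages = [
    SimpleNamespace(
        data=[SimpleNamespace(id=1, text="a", author_id=10,
                              attachments={"media_keys": ["3_1"]})],
        includes={
            "users": [SimpleNamespace(id=10, username="alice")],
            "media": [SimpleNamespace(media_key="3_1", type="photo",
                                      url="https://example.com/1.jpg",
                                      preview_image_url=None)],
        },
    ),
    SimpleNamespace(data=None, includes={}),  # empty final page
]

records = collect_media_records(fake_pages)
print(records)
```

With a real client you would pass the paginator itself, for example records = collect_media_records(paginator), and then build the dataframe with pd.DataFrame(records). The getattr and .get fallbacks are a design choice so that a tweet without attachments, or an author missing from the includes, produces None values instead of raising.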
