I am using the Twitter Academic Research V2 API and want to get tweets from a list of users and store them in a dataframe. My code works for a single user, but not for a list of users. See the code here:
import tweepy
from twitter_authentication import bearer_token
import time
import pandas as pd

client = tweepy.Client(bearer_token, wait_on_rate_limit=True)
# list of twitter users
csu = ["Markus_Soeder", "DoroBaer", "andreasscheuer"]
csu_tweets = []
start = time.time()
for politician in csu:
    for response in tweepy.Paginator(client.search_all_tweets,
                                     query=f'from:{politician} -is:retweet lang:de',
                                     user_fields=['username', 'public_metrics', 'description', 'location'],
                                     tweet_fields=['created_at', 'geo', 'public_metrics', 'text'],
                                     expansions='author_id',
                                     start_time='2022-12-01T00:00:00Z',
                                     end_time='2022-12-03T00:00:00Z'):
        time.sleep(1)
        csu_tweets.append(response)
end = time.time()
print(f"Scraping of {csu} needed {(end - start)/60} minutes.")
result = []
user_dict = {}
# Loop through each response object
for response in csu_tweets:
    # Take all of the users, and put them into a dictionary of dictionaries with the info we want to keep
    for user in response.includes['users']:
        user_dict[user.id] = {'username': user.username,
                              'followers': user.public_metrics['followers_count'],
                              'tweets': user.public_metrics['tweet_count'],
                              'description': user.description,
                              'location': user.location
                              }
    for tweet in response.data:
        # For each tweet, find the author's information
        author_info = user_dict[tweet.author_id]
        # Put all of the information we want to keep in a single dictionary for each tweet
        result.append({'author_id': tweet.author_id,
                       'username': author_info['username'],
                       'author_followers': author_info['followers'],
                       'author_tweets': author_info['tweets'],
                       'author_description': author_info['description'],
                       'author_location': author_info['location'],
                       'text': tweet.text,
                       'created_at': tweet.created_at,
                       'quote_count': tweet.public_metrics['quote_count'],
                       'retweets': tweet.public_metrics['retweet_count'],
                       'replies': tweet.public_metrics['reply_count'],
                       'likes': tweet.public_metrics['like_count'],
                       })
# Change this list of dictionaries into a dataframe
df = pd.DataFrame(result)
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_25716/2249018491.py in <module>
4 for response in csu_tweets:
5 # Take all of the users, and put them into a dictionary of dictionaries with the info we want to keep
----> 6 for user in response.includes['users']:
7 user_dict[user.id] = {'username': user.username,
8 'followers': user.public_metrics['followers_count'],
KeyError: 'users'
So I get this KeyError: 'users'. I don't get the error if I only scrape tweets from a single user and replace csu = ["Markus_Soeder", "DoroBaer", "andreasscheuer"] with csu = ["Markus_Soeder"]. Does anyone know what could be the issue?
Thanks in advance!
I found the answer to this issue. I got the KeyError because, for some users, there were no tweets in this time range. For those empty pages the response's data is stored as None and its includes contain no 'users' key, so the for loop over response.includes['users'] failed.
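A minimal sketch of the resulting guard: skip the missing 'users' key with dict.get and treat a None data payload as an empty list. To keep it self-contained, the example uses stand-in pages shaped like the relevant parts of tweepy's Response (an includes dict and a data payload) rather than real API responses; the pattern is the same.

```python
# Stand-in pages: one with results, one empty (no tweets in the time range).
# Real tweepy Response objects expose .includes (a dict that may lack 'users')
# and .data (None when the page is empty), which is what raised the KeyError.
pages = [
    {"includes": {"users": [{"id": 1, "username": "Markus_Soeder"}]},
     "data": [{"author_id": 1, "text": "Servus"}]},
    {"includes": {}, "data": None},  # this page previously raised KeyError: 'users'
]

user_dict = {}
result = []
for page in pages:
    # .get with a default skips pages whose includes have no 'users' key
    for user in page["includes"].get("users", []):
        user_dict[user["id"]] = {"username": user["username"]}
    # `or []` turns a None data payload into an empty iterable
    for tweet in page["data"] or []:
        result.append({"username": user_dict[tweet["author_id"]]["username"],
                       "text": tweet["text"]})

print(result)
```

With real responses the same two guards (response.includes.get('users', []) and response.data or []) drop into the loop from the question unchanged, and empty pages are simply skipped instead of crashing.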