
How to Get Full Tweets in CSV Cells Using Tweepy.Cursor

I'm rather new to coding (have only used R for regression modeling) and am now learning Python for a research assistantship. My present task is to use for loops and tweepy.Cursor/API search to search tweets by a list of hashtags and to convert them to a dataframe and store the results in a CSV file.

I have managed to do so, but the tweets appear truncated in the cells of the CSV file after using this code (mostly inherited from a grad student to help me get started):

import tweepy as tw 
import pandas as pd
import numpy as np
import re

consumer_key = ""
consumer_secret = ""
atoken = ""
asecret = ""

auth = tw.OAuthHandler(consumer_key, consumer_secret) 
auth.set_access_token(atoken, asecret)
api = tw.API(auth, wait_on_rate_limit=True)

hashtag_list = open('hashtag_list.txt', "r")

tweets = []
appended_data = []
tw_all_hashtags = pd.DataFrame(columns = ["text", "hashtag"]) 

for hashtag in hashtag_list:
    hashtag = hashtag.replace('\n','') 
    try:
        for i in tw.Cursor(api.search, q = hashtag, lang = "en", twitter_mode = 'extended').items(25): 
            tweets.append(i)

        one_hashtag_df = pd.DataFrame(vars(tweets[i]) for i in range(len(tweets)))  
        one_hashtag_df.dropna(subset=['text'], inplace=True)
        one_hashtag_df.drop_duplicates(subset='text', keep="last")
        one_hashtag_df = one_hashtag_df.drop(one_hashtag_df.index[150:])
        one_hashtag_df["hashtag"] = hashtag
        tw_all_hashtags = tw_all_hashtags.append(one_hashtag_df[["text", "hashtag"]], ignore_index=True)
        tweets = [] 
    except:
      print("Temporary error. Please try again later.") 

      
for i in range(len(tw_all_hashtags)):
    x = tw_all_hashtags.iloc[i]['text']
    tw_all_hashtags.iloc[i]['text'] = ' '.join(
        re.sub("(@[A-Za-z0-9]+)|([^0-9A-Za-z \t])|(\w+:\/\/\S+)", " ", x).split()) 
tw_all_hashtags['text'] = tw_all_hashtags['text'].str.replace('RT', '')
tw_all_hashtags.reset_index(drop=True).to_csv("tweets_hashtag.csv", index=False)

As you'll see, I tried adding the argument twitter_mode = 'extended' to the tw.Cursor line, but this changed nothing in the final CSV file. I receive no errors, but the tweets are still cut off when I view them in Excel. Any advice for a newbie on how to solve this little problem of mine? Thanks in advance. Cheers!

Use tweet_mode = "extended" instead (the argument is named tweet_mode, not twitter_mode).
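One more thing to watch for: with tweet_mode = "extended", the Status objects carry the text in a full_text attribute rather than text, so the dropna(subset=['text']) line in the question would likely raise a KeyError, which the bare except then hides behind the "Temporary error" message. For retweets, the top-level full_text can still end in an ellipsis, and the complete text sits in retweeted_status.full_text. Below is a minimal sketch of how the text could be pulled out. The helper names get_full_text and collect_rows are made up for illustration, and the SimpleNamespace objects are stand-ins for real tweepy results so the snippet runs without API keys.

```python
from types import SimpleNamespace


def get_full_text(status):
    """Return the untruncated text of a status fetched with tweet_mode="extended".

    Hypothetical helper, not part of tweepy.
    """
    # For retweets, the complete text lives on the nested original tweet.
    retweeted = getattr(status, "retweeted_status", None)
    if retweeted is not None:
        return getattr(retweeted, "full_text", None) or getattr(retweeted, "text", "")
    # Fall back to .text in case the status was fetched without extended mode.
    return getattr(status, "full_text", None) or getattr(status, "text", "")


def collect_rows(statuses, hashtag):
    """Build one {"text", "hashtag"} dict per status, ready for pd.DataFrame(rows)."""
    return [{"text": get_full_text(s), "hashtag": hashtag} for s in statuses]


# Stand-in objects that mimic the attributes of tweepy Status objects.
demo_statuses = [
    SimpleNamespace(full_text="a long original tweet #python"),
    SimpleNamespace(
        full_text="RT @user: a long origi…",
        retweeted_status=SimpleNamespace(
            full_text="a long original retweeted tweet #python"
        ),
    ),
]
rows = collect_rows(demo_statuses, "#python")
print(rows)

# With a real API object, the loop from the question would become roughly:
#   cursor = tw.Cursor(api.search, q=hashtag, lang="en", tweet_mode="extended")
#   rows = collect_rows(cursor.items(25), hashtag)
# (In Tweepy 4.x, api.search was renamed to api.search_tweets.)
```

The resulting list of dicts can be passed straight to pd.DataFrame(rows), which avoids going through vars() and the missing text column altogether.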
