Extract date from tweets (Tweepy, Python)

I'm new to Python and struggling a bit with this. The code below gets the text of tweets containing the hashtag #bitcoin, and I want to extract the date and author as well as the text. I've tried different things, but I'm stuck right now. I'd greatly appreciate any help with this.

import pandas as pd
import numpy as np
import tweepy

api_key = '*'
api_secret_key = '*'
access_token = '*'
access_token_secret = '*'

# Authenticate with the keys defined above
authentication = tweepy.OAuthHandler(api_key, api_secret_key)
authentication.set_access_token(access_token, access_token_secret)
api = tweepy.API(authentication, wait_on_rate_limit=True)

#Get tweets about Bitcoin and filter out any retweets
search_term = '#bitcoin -filter:retweets'
tweets = tweepy.Cursor(api.search_tweets, q=search_term, lang='en', since='2018-11-01', tweet_mode='extended').items(50)
all_tweets = [tweet.full_text for tweet in tweets]


df = pd.DataFrame(all_tweets, columns=['Tweets'])
df.head()

If you use dir(tweet), you can see all the variables and functions in the tweet object:

author
contributors
coordinates
created_at
destroy
display_text_range
entities
extended_entities
favorite
favorite_count
favorited
full_text
geo
id
id_str
in_reply_to_screen_name
in_reply_to_status_id
in_reply_to_status_id_str
in_reply_to_user_id
in_reply_to_user_id_str
is_quote_status
lang
metadata
parse
parse_list
place
possibly_sensitive
retweet
retweet_count
retweeted
retweets
source
source_url
truncated
user
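The listing above shows only the public names. A minimal sketch of how such a listing could be produced; `public_names` is a hypothetical helper (not part of Tweepy), and the stand-in object only mimics two tweet attributes so the example runs without API keys.

```python
from types import SimpleNamespace


def public_names(obj):
    """Return the attribute and method names of obj that do not start with '_'."""
    return [name for name in dir(obj) if not name.startswith('_')]


# Stand-in for a tweet object (assumption: made-up data, not a real Tweepy Status).
fake_tweet = SimpleNamespace(full_text='hello', created_at='2022-03-26')
print('\n'.join(public_names(fake_tweet)))
```

With a real result, calling `public_names(tweet)` inside the loop should print a list similar to the one above.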

Among them is created_at, which holds the date of the tweet:

all_tweets = []

# The cursor can only be iterated once, so collect every field in a single pass
for tweet in tweets:
    #print('\n'.join(dir(tweet)))
    all_tweets.append([tweet.full_text, tweet.created_at])

df = pd.DataFrame(all_tweets, columns=['Tweets', 'Created At'])
df.head()

Result:

                                           Tweets                Created At
0  @Ralvero Of course $KAWA ready for 100x 🚀#ETH ... 2022-03-26 13:51:06+00:00
1  Pairs:1INCHUSDT \n SELL:1.58500\n Time :3/26/2...  2022-03-26 13:51:06+00:00
2  @hotcrosscom @iSafePal 🌐 First LIVE Dapp: Cylu... 2022-03-26 13:51:04+00:00
3  @Justdoitalex @Isabel_Schnabel Finally a truth...  2022-03-26 13:51:03+00:00
4  #Bitcoin has rejected for the fourth time the ...  2022-03-26 13:50:55+00:00
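The question also asks for the author. The dir(tweet) listing includes author and user; in Tweepy both hold a user object, and its screen_name attribute is the account handle. A minimal sketch, assuming the tweets were fetched with tweet_mode='extended' so that full_text exists; `tweet_rows` is a hypothetical helper, and the SimpleNamespace objects are made-up stand-ins so the example runs without API keys.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def tweet_rows(tweets):
    """Collect (text, creation time, author handle) for each tweet-like object."""
    rows = []
    for tweet in tweets:
        rows.append((tweet.full_text, tweet.created_at, tweet.user.screen_name))
    return rows


# Stand-in objects mimicking the three attributes used above (made-up data).
sample = [
    SimpleNamespace(
        full_text='example text',
        created_at=datetime(2022, 3, 26, 13, 51, 6, tzinfo=timezone.utc),
        user=SimpleNamespace(screen_name='example_user'),
    ),
]
rows = tweet_rows(sample)
print(rows)
```

With real results, pd.DataFrame(tweet_rows(tweets), columns=['Tweets', 'Created At', 'Author']) should give one column per field.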

However, your code has a problem with since: it seems this parameter was removed in Tweepy version 3.8.

See: Collect tweets in a specific time period in Tweepy, until and since doesn't work
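If a lower date bound is still needed, one option is to filter on created_at after fetching. This is a workaround sketched here, not something the linked question prescribes, and `keep_since` is a hypothetical helper. The created_at values in the result above are timezone-aware, so the cutoff has to be timezone-aware as well. Keep in mind that the standard search endpoint only returns recent tweets (roughly the last week), so a cutoff as old as 2018-11-01 would not exclude anything.

```python
from datetime import datetime, timezone


def keep_since(rows, cutoff):
    """Keep the [text, created_at] rows whose created_at is on or after cutoff."""
    return [row for row in rows if row[1] >= cutoff]


# Made-up rows in the same [text, created_at] shape as all_tweets.
rows = [
    ['older tweet', datetime(2022, 3, 25, 9, 0, tzinfo=timezone.utc)],
    ['newer tweet', datetime(2022, 3, 26, 13, 51, 6, tzinfo=timezone.utc)],
]
cutoff = datetime(2022, 3, 26, tzinfo=timezone.utc)
recent = keep_since(rows, cutoff)
print([text for text, _ in recent])
```

The filtered list can be passed to pd.DataFrame in the same way as all_tweets.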
