How to aggregate tweets of a given query within an hourly time period using Twitter Search API?

I am using the Tweepy interface to the Twitter API to aggregate tweets matching a particular query term, for real-time sentiment analysis. My objective is to collect tweets for each hour of each day over the past 7 days for the given query term and analyze how sentiment has varied over time. Each search request returns 100 tweets.

As I understand it, the Twitter API provides since and until operators for the search query: you enter two dates and tweets posted between those dates are fetched. However, they don't seem to work with finer time periods such as hours or minutes. Is there any way to search by hour?

Bonus question: During a search, about 75% of the tweets fetched are retweets of the same tweet. I have to remove the duplicates after fetching them all by checking the retweeted_status attribute of each tweet. Does the API provide any way to filter out retweets on the server side, before they are fetched, so I get more relevant data?

To the bonus question: yes, you can filter out retweets at the API level, as described in the Twitter API documentation on standard operators: https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators
Simply add the operator to your query before you pass it to the Cursor.

query="search_this -filter:retweets"
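A slightly fuller sketch of the same idea, assuming an already authenticated tweepy.API object named api. The helper names without_retweets and fetch_tweets are made up for illustration, and the search method is called search_tweets in Tweepy 4.x (it was search in Tweepy 3.x), so adjust to the version you have installed.

```python
RETWEET_FILTER = "-filter:retweets"  # standard search operator quoted in the answer


def without_retweets(term: str) -> str:
    """Return the search term with the retweet filter appended exactly once."""
    if RETWEET_FILTER in term.split():
        return term
    return f"{term} {RETWEET_FILTER}"


def fetch_tweets(api, term: str, limit: int = 100):
    """Yield up to `limit` search results for `term`, with retweets filtered out.

    `api` is assumed to be an authenticated tweepy.API instance.
    """
    import tweepy  # imported lazily so without_retweets works without Tweepy installed

    # Tweepy 4.x names this method search_tweets; Tweepy 3.x called it search.
    cursor = tweepy.Cursor(api.search_tweets, q=without_retweets(term), count=100)
    yield from cursor.items(limit)


query = without_retweets("search_this")
print(query)
```

Because the filter is applied by the search endpoint, the 100 results per request are no longer spent on retweets.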

Relevant Stack Overflow question: Tweepy - Exclude Retweets
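The hourly part of the question is not covered above. As the question notes, since and until only seem to take dates, so one possible workaround is to fetch tweets for a whole day and group them by hour on the client side. This is a minimal sketch, not a confirmed API feature: it assumes each tweet object exposes created_at as a datetime and that retweets carry a retweeted_status attribute, as Tweepy Status objects do. bucket_by_hour is a hypothetical helper name, and the sample objects are stand-ins for real search results.

```python
from collections import defaultdict
from datetime import datetime
from types import SimpleNamespace


def bucket_by_hour(tweets):
    """Group tweets by the hour they were created, skipping retweets."""
    buckets = defaultdict(list)
    for tweet in tweets:
        # Retweets carry a retweeted_status attribute; original tweets do not.
        if hasattr(tweet, "retweeted_status"):
            continue
        # Truncate the timestamp to the start of its hour to get the bucket key.
        hour = tweet.created_at.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(tweet)
    return dict(buckets)


# Stand-in objects for illustration only; real input would be Tweepy Status objects.
sample = [
    SimpleNamespace(text="a", created_at=datetime(2020, 1, 1, 10, 5)),
    SimpleNamespace(text="b", created_at=datetime(2020, 1, 1, 10, 59)),
    SimpleNamespace(text="c", created_at=datetime(2020, 1, 1, 11, 0)),
    SimpleNamespace(text="RT a", created_at=datetime(2020, 1, 1, 10, 30),
                    retweeted_status=object()),
]
hourly = bucket_by_hour(sample)
for hour, items in sorted(hourly.items()):
    print(hour.isoformat(), len(items))
```

Each bucket can then be passed to the sentiment step separately. Busy hours may need more than one page of results to be represented fairly, since each request returns only 100 tweets.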
