Getting unique tweets with tweepy

Question

I am trying to build a corpus of tweets using a number of search terms. One issue I am having is that I cannot get only unique tweets: the results also include retweets.

Is there a way to remove these beforehand without doing any text processing?

What I've got now:

api = tweepy.API(auth)
for search in hashtags:
    for tweet in tweepy.Cursor(api.search, q=search, count=1000, lang="en").items():
        text = repr(tweet.text.encode("utf-8"))
        out.write(text + "\n")

Answer 1

You can add " -filter:retweets" to your query to get only original tweets. Maybe not the prettiest solution, but it works.

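api = tweepy.API(auth)
for search in hashtags:
    # The -filter:retweets search operator excludes retweets from the results.
    # Note: in tweepy 4.0 and later, api.search is named api.search_tweets.
    query = search + " -filter:retweets"
    for tweet in tweepy.Cursor(api.search, q=query, count=1000, lang="en").items():
        text = repr(tweet.text.encode("utf-8"))
        out.write(text + "\n")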
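
Because the loop runs once per search term, the same tweet can still show up under more than one term. As a complement to the query operator, here is a minimal sketch of a client-side check. It assumes the status objects expose an `id` and carry a `retweeted_status` attribute only when they are retweets (as the v1.1 tweet payload does). The helper names `without_retweets`, `is_retweet`, and `unique_originals` are made up for illustration, and the sample uses stand-in objects rather than live API results.

```python
from types import SimpleNamespace


def without_retweets(term):
    """Append the search operator that excludes retweets."""
    return term + " -filter:retweets"


def is_retweet(tweet):
    """True if the status object carries a retweeted_status attribute."""
    return hasattr(tweet, "retweeted_status")


def unique_originals(tweets):
    """Yield each original tweet once, skipping retweets and repeated ids."""
    seen_ids = set()
    for tweet in tweets:
        if is_retweet(tweet) or tweet.id in seen_ids:
            continue
        seen_ids.add(tweet.id)
        yield tweet


# Stand-in objects instead of live API results (hypothetical data).
sample = [
    SimpleNamespace(id=1, text="original"),
    SimpleNamespace(id=2, text="RT @a: original", retweeted_status=object()),
    SimpleNamespace(id=1, text="original"),  # same tweet found by another term
]
kept = [t.id for t in unique_originals(sample)]
print(without_retweets("#python"))
print(kept)
```

In the real loop, one `seen_ids` set would need to be shared across all search terms, for example by chaining the cursors' results into a single call to `unique_originals`. Neither check inspects the tweet text.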

Answer 1
3 2016-12-12 09:40:08
