
Tweepy Streaming filter fields

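I have this Python code, which uses Tweepy and the Streaming API to retrieve data from Twitter and stops once 1000 results (i.e. 1000 tweets) have been found. It works well, but when I try to run it in PyCharm, part of the results is cut off. Since the code returns all the data for each tweet (ID, text, author, etc.), it probably generates too much data and the software crashes. So I would like to modify the code to fetch only certain fields of the Twitter data (for example, I only need the tweet text, author, and date). Any suggestions?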

# Import the necessary package to process data in JSON format
try:
    import json
except ImportError:
    import simplejson as json

# Import the necessary methods from "twitter" library
from twitter import Twitter, OAuth, TwitterHTTPError, TwitterStream

# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN = ''
ACCESS_SECRET = ''
CONSUMER_KEY = ''
CONSUMER_SECRET = ''


oauth = OAuth(ACCESS_TOKEN, ACCESS_SECRET, CONSUMER_KEY, CONSUMER_SECRET)

# Initiate the connection to Twitter Streaming API
twitter_stream = TwitterStream(auth=oauth)

# Get a sample of the public data following through Twitter
# iterator = twitter_stream.statuses.sample()  # plain Twitter streaming sample

# Twitter streaming filtered by a search track and a language.
# For other settings and search parameters, see https://dev.twitter.com/streaming/overview/request-parameters
iterator = twitter_stream.statuses.filter(track="Euro2016", language="en")


# Print each tweet in the stream to the screen
# Here we set it to stop after getting 1000 tweets.
# You don't have to set it to stop, but can continue running
# the Twitter API to collect data for days or even longer.
tweet_count = 1000  # number of results to return
for tweet in iterator:
    tweet_count -= 1
    # Twitter Python Tool wraps the data returned by Twitter
    # as a TwitterDictResponse object.
    # We convert it back to the JSON format to print/score
    print(json.dumps(tweet))

    # The command below will do pretty printing for JSON data, try it out
    # print(json.dumps(tweet, indent=4))

    if tweet_count <= 0:
        break

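I can run this program in PyCharm without any problem and get the 1000 tweets. So try running it on another computer, or investigate whether there is a problem with your current system.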

The result is a Python dictionary, so all you need to do to access individual elements is the following:

for tweet in iterator:
    tweet_count -= 1
    # access the elements such as 'text', 'created_at' ...
    print(tweet['text'])

    if tweet_count <= 0:
        break
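
To keep only the three fields the question asks for (text, author, date), the dictionary access above can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: `extract_fields` and `sample` are hypothetical names, and it assumes each tweet carries the usual `text`, `created_at` and `user` / `screen_name` keys. The stream can also deliver messages that are not tweets (for example delete notices), so the helper skips anything without a `text` key.

```python
def extract_fields(tweet):
    """Return only the text, author and date of a tweet dict, or None."""
    # Messages without a 'text' key (e.g. delete notices) are not tweets.
    if 'text' not in tweet:
        return None
    return {
        'text': tweet['text'],
        # Assumption: the author is taken from user.screen_name.
        'author': tweet.get('user', {}).get('screen_name'),
        'created_at': tweet.get('created_at'),
    }


# Hypothetical sample shaped like a tweet payload, for illustration only.
sample = {
    'id': 1,
    'text': 'Great match! #Euro2016',
    'created_at': 'Fri Jun 10 19:00:00 +0000 2016',
    'user': {'id': 42, 'screen_name': 'example_user'},
    'entities': {'hashtags': [{'text': 'Euro2016'}]},
}

print(extract_fields(sample))
```

Inside the streaming loop, the same helper could be called on each `tweet` and only a non-`None` result printed with `json.dumps`, which keeps the output much smaller than dumping the whole tweet.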
