简体   繁体   中英

How can I consume tweets from Twitter's streaming api and store them in mongodb

I need to develop an app that lets me track tweets and save them in a mongodb for a research project (as you might gather, I am a noob, so please bear with me). I have found this piece of code that sends tweets streaming through my terminal window:

import sys
import tweepy

consumer_key=""
consumer_secret=""
access_key = ""
access_secret = "" 


auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)

class CustomStreamListener(tweepy.StreamListener):
    def on_status(self, status):
        print status.text

    def on_error(self, status_code):
        print >> sys.stderr, 'Encountered error with status code:', status_code
        return True # Don't kill the stream

    def on_timeout(self):
        print >> sys.stderr, 'Timeout...'
        return True # Don't kill the stream

sapi = tweepy.streaming.Stream(auth, CustomStreamListener())
sapi.filter(track=['Gandolfini'])

Is there a way I can modify this piece of code so that instead of having tweets streaming over my screen, they are sent to my mongodb database?

Thanks

Here's an example:

import json
import pymongo
import tweepy

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)


class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, api):
        self.api = api
        super(tweepy.StreamListener, self).__init__()

        self.db = pymongo.MongoClient().test

    def on_data(self, tweet):
        self.db.tweets.insert(json.loads(tweet))

    def on_error(self, status_code):
        return True # Don't kill the stream

    def on_timeout(self):
        return True # Don't kill the stream


sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=['Gandolfini'])

This will write tweets to the mongodb test database, tweets collection.

Hope that helps.

I have developed a simple command line tool that does exactly this.

https://github.com/janezkranjc/twitter-tap

It allows using the streaming API or the search API.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM