
How can I run my script remotely and store its data in a hosted database?

I have a script that consumes tweets from Twitter's Streaming API and stores them in a MongoDB instance on my localhost. To improve uptime, I would like to run it remotely and store the tweets in a cloud-hosted database, e.g. MongoLab.

Here is my script:

import json
import pymongo
import tweepy

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)


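class CustomStreamListener(tweepy.StreamListener):
    def __init__(self, api):
        # Pass this class (not the parent) to super() so that
        # StreamListener.__init__ runs and stores the api object
        super(CustomStreamListener, self).__init__(api)

        # Default local MongoDB instance, database "test"
        self.db = pymongo.MongoClient().test

    def on_data(self, tweet):
        # Each message arrives as a raw JSON string
        self.db.tweets.insert(json.loads(tweet))

    def on_error(self, status_code):
        return True # Don't kill the stream

    def on_timeout(self):
        return True # Don't kill the stream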


sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=['Gandolfini'])

I have set up accounts with MongoLab and Heroku but am completely stuck (I am new to all things programming). To move forward, I think I need to resolve two problems: i) How can I host my script on Heroku? ii) How can I point my script, running on Heroku, to my MongoLab account? Any thoughts?

Here's a guide to getting Python set up on Heroku:

https://devcenter.heroku.com/articles/python

And to connect your code to your MongoLab database, all you need to do is pass the URI to your MongoClient object. If you're using the MongoLab add-on through Heroku, the URI is bound for you in an environment variable:

https://devcenter.heroku.com/articles/mongolab#getting-your-connection-uri

You should be able to read it with os.getenv() (remember to add "import os" at the top of your script):

http://docs.python.org/2/library/os.html#os.getenv

Also, make sure you use the right database name (don't use "test"). The name of your database will appear at the end of the URI after the last slash '/'. In the end, you should end up with something like this:

self.db = pymongo.MongoClient(os.getenv("MONGOLAB_URI")).heroku_appXXXXXXX
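If you would rather not hard-code the database name, it can be read from the URI itself. The following is only a minimal sketch, written for Python 3 and using just the standard library. The helper name database_name_from_uri and the EXAMPLE_URI value are hypothetical and used purely for illustration; it assumes the URI has the usual mongodb://user:password@host:port/dbname shape.

```python
import os
from urllib.parse import urlparse


def database_name_from_uri(uri):
    """Return the part of a MongoDB URI that follows the last '/'."""
    path = urlparse(uri).path  # e.g. "/heroku_app1234567"
    name = path.lstrip("/")
    if not name:
        raise ValueError("URI does not name a database")
    return name


# Hypothetical placeholder, used only when MONGOLAB_URI is not set
EXAMPLE_URI = "mongodb://user:secret@host.example.com:27017/heroku_app1234567"

uri = os.getenv("MONGOLAB_URI", EXAMPLE_URI)
db_name = database_name_from_uri(uri)
print(db_name)
```

The result could then select the database with pymongo.MongoClient(uri)[db_name] in place of the hard-coded attribute name.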

Be aware that calling the Twitter API from Heroku can currently run into Twitter's IP-address-based rate limiting. Your application shares an IP address with other Heroku applications that may also be sending requests to Twitter, and Twitter can blacklist that shared address. See these two questions for more details:

Twitter Rate Limits for Site hosted on Heroku

(twitter) Authentication failure! timeout: Net::OpenTimeout, execution expired
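Since the script above returns True from on_error to keep the stream alive, one way to be gentler on a shared IP address is to wait progressively longer after each error. This is a rough sketch of exponential backoff only. The function name backoff_delays and the default values (60 seconds, doubling, capped at 960 seconds) are assumptions for illustration, not limits taken from Twitter or tweepy.

```python
import itertools


def backoff_delays(initial=60.0, factor=2.0, maximum=960.0):
    """Yield wait times forever: initial, initial*factor, ... capped at maximum."""
    delay = initial
    while True:
        yield min(delay, maximum)
        delay = min(delay * factor, maximum)


# First five waits, in seconds, with the assumed defaults
print(list(itertools.islice(backoff_delays(), 5)))
```

A listener could, for example, keep one such generator and call time.sleep(next(delays)) inside on_error before returning True. Whether that fits your tweepy version's own reconnect behaviour is something to check first.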
