I have a script that consumes tweets from Twitter's streaming API and stores them in a MongoDB instance on localhost. To improve uptime, I would like to run it remotely and store the tweets in a cloud-hosted database, e.g. MongoLab.
Here is my script:
    import json
    import pymongo
    import tweepy

    consumer_key = ""
    consumer_secret = ""
    access_key = ""
    access_secret = ""

    auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
    auth.set_access_token(access_key, access_secret)
    api = tweepy.API(auth)

    class CustomStreamListener(tweepy.StreamListener):
        def __init__(self, api):
            self.api = api
            super(tweepy.StreamListener, self).__init__()
            self.db = pymongo.MongoClient().test

        def on_data(self, tweet):
            self.db.tweets.insert(json.loads(tweet))

        def on_error(self, status_code):
            return True # Don't kill the stream

        def on_timeout(self):
            return True # Don't kill the stream

    sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
    sapi.filter(track=['Gandolfini'])
I have set up accounts with MongoLab and Heroku but am completely stuck (I am new to programming). I think I need to solve two problems: (i) how can I host my script on Heroku? (ii) how can I point the script, running on Heroku, at my MongoLab account? Any thoughts?
Here's a guide to getting Python set up on Heroku:
https://devcenter.heroku.com/articles/python
And to connect your code to your MongoLab database, all you need to do is pass the URI to your MongoClient object. If you're using the MongoLab add-on through Heroku, the URI is bound for you in an environment variable:
https://devcenter.heroku.com/articles/mongolab#getting-your-connection-uri
You should be able to use os.getenv() to get it:
http://docs.python.org/2/library/os.html#os.getenv
Also, make sure you use the right database name (don't use "test"). The name of your database appears at the end of the URI, after the last slash '/'. In the end, with import os added at the top of the script, you should end up with something like this:
    self.db = pymongo.MongoClient(os.getenv("MONGOLAB_URI")).heroku_appXXXXXXX
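If you would rather not hard-code the database name, one option is to read it from the URI itself. This is only a sketch: the helper names database_name_from_uri and get_database are made up for illustration, and it assumes Python 3 and that the URI has the usual mongodb://user:password@host:port/dbname shape.

```python
import os
from urllib.parse import urlparse


def database_name_from_uri(uri):
    """Return the database name: the part of the URI path after the last '/'."""
    path = urlparse(uri).path  # e.g. "/heroku_appXXXXXXX"
    return path.lstrip("/")


def get_database(env_var="MONGOLAB_URI"):
    """Open the database named in the URI stored in an environment variable.

    Hypothetical helper; env_var defaults to the variable name used above.
    """
    # Imported here so database_name_from_uri works without pymongo installed.
    import pymongo

    uri = os.getenv(env_var)
    if uri is None:
        raise RuntimeError("%s is not set" % env_var)
    return pymongo.MongoClient(uri)[database_name_from_uri(uri)]
```

In the listener, the assignment would then become self.db = get_database(). Failing loudly when the variable is missing makes it easier to notice that the script is running somewhere the add-on is not configured, instead of silently connecting to localhost.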
Be aware that calling the Twitter API from Heroku can currently run into problems with Twitter's IP-address-based rate limiting. Your application shares an IP address with other Heroku applications that may also be sending requests to Twitter, and Twitter can blacklist that shared address. See these two questions for more details:
Twitter Rate Limits for Site hosted on Heroku
(twitter) Authentication failure! timeout: Net::OpenTimeout, execution expired