How can I run my script remotely, storing data in a database?

I have a script that consumes tweets from Twitter's streaming API into my localhost MongoDB. To improve uptime, I would like to run it remotely, storing the tweets in a cloud-hosted database, e.g. MongoLab.

Here is my script:

import json
import pymongo
import tweepy

consumer_key = ""
consumer_secret = ""
access_key = ""
access_secret = ""

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_key, access_secret)
api = tweepy.API(auth)


class CustomStreamListener(tweepy.StreamListener):
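    def __init__(self, api):
        # Initialise the StreamListener base class, which stores the api
        # object; the original super() call skipped that initialiser.
        super(CustomStreamListener, self).__init__(api)

        self.db = pymongo.MongoClient().test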

    def on_data(self, tweet):
        self.db.tweets.insert(json.loads(tweet))

    def on_error(self, status_code):
        return True # Don't kill the stream

    def on_timeout(self):
        return True # Don't kill the stream


sapi = tweepy.streaming.Stream(auth, CustomStreamListener(api))
sapi.filter(track=['Gandolfini'])

Now, I have set up accounts with MongoLab and Heroku, but I am completely stuck (I am new to programming). I think I need to solve two problems: i) how can I host my script on Heroku? ii) how can I point my script, running on Heroku, to my MongoLab account? Any thoughts?

Here's a guide to getting Python set up on Heroku:

https://devcenter.heroku.com/articles/python

To connect your code to your MongoLab database, all you need to do is pass the URI to your MongoClient object. If you're using the MongoLab add-on through Heroku, the URI is provided for you in an environment variable:

https://devcenter.heroku.com/articles/mongolab#getting-your-connection-uri

You should be able to use os.getenv() to get it:

http://docs.python.org/2/library/os.html#os.getenv

Also, make sure you use the right database name (don't use "test"). The database name appears at the end of the URI, after the last slash '/'. You should end up with something like this:

self.db = pymongo.MongoClient(os.getenv("MONGOLAB_URI")).heroku_appXXXXXXX
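If you would rather not hard-code the database name, here is a minimal sketch of reading it from the URI instead. It assumes the URI has the usual mongodb://user:password@host:port/dbname shape and that the variable is called MONGOLAB_URI as above; the helper names database_name_from_uri and get_database are made up for this example and are not part of pymongo.

```python
import os
from urllib.parse import urlparse


def database_name_from_uri(uri):
    """Return the part of a MongoDB URI after the last slash (the database name)."""
    # For "mongodb://user:pass@host:port/dbname" the parsed path is "/dbname".
    return urlparse(uri).path.lstrip("/")


def get_database(env_var="MONGOLAB_URI"):
    """Connect using the URI stored in env_var and return that URI's database."""
    uri = os.getenv(env_var)
    if uri is None:
        # os.getenv() returns None when the variable is not set.
        raise RuntimeError("%s is not set" % env_var)

    import pymongo  # imported late so the parsing helper works without pymongo

    client = pymongo.MongoClient(uri)
    return client[database_name_from_uri(uri)]
```

In the listener you could then write self.db = get_database() in place of the hard-coded line, and the script would follow whatever database the environment variable points at.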

Be aware that, at the time of writing, calling the Twitter API from Heroku can run into Twitter's IP-address-based rate limiting. Your application shares an IP address with other Heroku applications that may also be sending requests to Twitter, and Twitter can blacklist that shared address. See these two questions for more details:

Twitter Rate Limits for Site hosted on Heroku

(twitter) Authentication failure! timeout: Net::OpenTimeout, execution expired
