
Periodic background jobs on Rails & Heroku

I'm developing a Rails app which gives pricing data on various products by scraping prices from 3rd party sites (similar to http://railscasts.com/episodes/190-screen-scraping-with-nokogiri ).

Since I'm new to programming, right now I am doing this manually by putting my code in a rake task. The task loops through all the products in my database and updates their prices through scraping. It takes a few hours to complete (since there are thousands of products), but most of that time is spent in calls to sleep, which I use to rate limit myself. Right now I'm calling the rake task manually from the command line, but I'd like to have a weekly periodic job that runs automatically in the background.

After a bit of research, it seems like there are several ways to do this (Resque, DelayedJob, Cron/Whenever) but I'm not sure which would best fit my needs. In addition, I'm deploying through Heroku, so I want to make sure I don't waste money on worker dynos; right now this is just a side project, so I wouldn't want to spend that much.

What would be a simple and cost-efficient way to do this?

I'm currently using the Heroku Scheduler. It can run tasks every day, every hour or every 10 minutes. It's extremely easy to use:

  1. Install the add-on with heroku addons:add scheduler:standard
  2. Go to your app on the Heroku website, select the Scheduler add-on and add a new job. You do this by defining the task ( rake name_of_your_task ), the frequency and the next run. And you're done.
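Since the coarsest frequency the Scheduler offers is daily, one way to get a weekly run is to schedule the task daily and have it exit early on every day except one. The following is only a minimal sketch of that idea in plain Ruby: `run_today?` is a hypothetical helper name, the choice of Sunday is an assumption, and the date is taken from whatever clock the process sees.

```ruby
require "date"

# Returns true only on the chosen weekday (0 = Sunday ... 6 = Saturday).
# `run_today?` is a hypothetical helper, not part of Rails or Heroku.
def run_today?(today = Date.today, weekday: 0)
  today.wday == weekday
end

if run_today?
  puts "Weekly run: starting price update"
  # Call the scraping code here.
else
  puts "Not the scheduled weekday, exiting"
end
```

Inside a rake task, the same check would go at the top of the task body so that the daily job finishes almost immediately on the other six days.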

There are, however, several problems:

  1. You need to give a valid credit card to be able to use this add-on even though it is, in principle, free.

  2. The Scheduler runs one-off processes that will count toward your dyno-hours.

  3. Heroku only gives you 750 free dyno hours per app.
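To see how little room that leaves for scheduled jobs, the arithmetic can be sketched as below. It assumes a single web dyno running around the clock for the whole month and uses the 750-hour figure quoted above; `spare_dyno_hours` is a hypothetical helper name.

```ruby
require "date"

FREE_DYNO_HOURS = 750 # figure quoted in this answer

# Hours left for one-off processes if one web dyno runs 24 hours a day
# for the whole month (an assumption made for this illustration).
def spare_dyno_hours(year, month)
  days_in_month = Date.new(year, month, -1).day
  FREE_DYNO_HOURS - 24 * days_in_month
end

puts spare_dyno_hours(2023, 1) # 31-day month
puts spare_dyno_hours(2023, 4) # 30-day month
```

Under those assumptions, a 31-day month leaves 750 - 744 = 6 hours and a 30-day month leaves 30 hours.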

This is what the Scheduler's wiki has to say about long-running jobs:

Scheduled jobs are meant to execute short running tasks or enqueue longer running tasks into a background job queue. Anything that takes longer than a couple of minutes to complete should use a worker dyno to run.

So my advice here would be:

  1. Break down your rake task into smaller chunks meant to run for only a couple of minutes each.

  2. Run these tasks more frequently (the Scheduler doesn't even offer a weekly option).

  3. Keep an eye on your dyno hours. You can do so here. 750 hours amount to 31 days and 6 hours, so you have at least 6 hours to work with, even in 31-day months. If your app is not being used, you can also use the following command to turn it off so it stops counting the regular dyno hours.

     heroku ps:scale web=0 

    And you can scale it back up with

     heroku ps:scale web=1 
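The first two pieces of advice can be sketched as a function that works through the stalest products first and stops once a time budget is used up, so each scheduled run stays short and the next run picks up where this one left off. This is only an illustration under stated assumptions: `Product` is a plain Struct standing in for the real model, `checked_at` is a hypothetical timestamp column, and `fetch_price` is a placeholder for the actual scraping code.

```ruby
# Hypothetical stand-in for the app's model; in Rails this would be an
# ActiveRecord class with a (hypothetical) checked_at timestamp column.
Product = Struct.new(:id, :price, :checked_at)

def monotonic_now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Updates the least recently checked products first and stops when the
# time budget runs out. `fetch_price` is any callable that takes a product
# and returns its scraped price (placeholder for the real scraper).
def refresh_stale_prices(products, budget_seconds:, delay_seconds:, fetch_price:)
  deadline = monotonic_now + budget_seconds
  updated = []
  # Products that were never checked sort first, then the oldest checks.
  products.sort_by { |product| product.checked_at || Time.at(0) }.each do |product|
    break if monotonic_now >= deadline
    product.price = fetch_price.call(product)
    product.checked_at = Time.now
    updated << product
    sleep(delay_seconds) # rate limit between requests
  end
  updated
end

products = [Product.new(1, nil, nil), Product.new(2, nil, Time.now)]
updated = refresh_stale_prices(products, budget_seconds: 120, delay_seconds: 0,
                               fetch_price: ->(_product) { 9.99 })
puts "Updated #{updated.size} products"
```

With a budget of a couple of minutes and an hourly or daily schedule, the whole catalogue would be covered gradually instead of in one multi-hour run.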

Unfortunately, there's no such thing as free computing power.
