I'm developing a Rails app which gives pricing data on various products by scraping prices from third-party sites (similar to http://railscasts.com/episodes/190-screen-scraping-with-nokogiri).
Since I'm new to programming, right now I am doing this manually by putting my code in a rake task. The task loops through all the products in my database and updates their prices through scraping. It takes a few hours to complete (since there are thousands of products), but most of that time is spent in calls to sleep so I can rate limit myself. Right now I'm calling the rake task manually from the command line, but I'd like to have a weekly periodic job that runs automatically in the background.
After a bit of research, it seems like there are several ways to do this (Resque, DelayedJob, Cron/Whenever), but I'm not sure which would best fit my needs. In addition, I'm deploying through Heroku, so I want to make sure I don't waste money on worker dynos; right now this is just a side project, so I wouldn't want to spend that much.
What would be a simple and cost efficient way to do this?
I'm currently using the Heroku Scheduler. It can run tasks every day, every hour or every 10 minutes. It's extremely easy to use. Install the add-on:
heroku addons:add scheduler:standard
Then, in the Scheduler dashboard, set the task to run (e.g. rake name_of_your_task), the frequency and the next run. And done. There are, however, several problems:
You need to give a valid credit card to be able to use this add-on even though it is, in principle, free.
The Scheduler runs one-off processes that will count toward your dyno-hours.
Heroku only gives you 750 free dyno hours per app.
This is what the Scheduler's wiki has to say about long-running jobs:
Scheduled jobs are meant to execute short running tasks or enqueue longer running tasks into a background job queue. Anything that takes longer than a couple of minutes to complete should use a worker dyno to run.
So my advice here would be:
Break down your rake task into smaller chunks meant to run only for a couple minutes.
Run these tasks more frequently (the Scheduler doesn't even have a weekly option).
Keep an eye on your dyno hours. 750 hours amount to 31 days and 6 hours, so with one web dyno running around the clock you have at least 6 spare hours, even in a 31-day month. If your app is not being used, you can also use the following command to turn it off so it stops consuming regular dyno hours.
heroku ps:scale web=0
And you can scale it back up with
heroku ps:scale web=1
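To put the dyno-hour arithmetic above into numbers, here is a small Ruby sketch. The method name spare_dyno_hours and its keyword arguments are made up for illustration; the only figure taken from this answer is the 750 free hours.

```ruby
# Hours left over for Scheduler jobs in a month, assuming `web_dynos`
# web dynos run 24 hours a day for `days_in_month` days.
# free_hours defaults to the 750-hour allowance mentioned above.
def spare_dyno_hours(days_in_month, web_dynos: 1, free_hours: 750)
  free_hours - 24 * days_in_month * web_dynos
end

[28, 30, 31].each do |days|
  puts "#{days}-day month: #{spare_dyno_hours(days)} spare hours"
end
# 28-day month: 78 spare hours
# 30-day month: 30 spare hours
# 31-day month: 6 spare hours
```

A scrape that takes a few hours once a week would not fit in a 31-day month with the web dyno always on, which is why scaling down or splitting the work matters.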
Unfortunately, there's no such thing as free computing power.
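As for breaking the task into chunks, one possible approach (a sketch, not the only way) is to let the hourly Scheduler run decide which slice of products to update, so that every product is still refreshed about once a week. The helpers weekly_slot and ids_for_slot are hypothetical, as are the Product model and update_price! method in the commented rake task.

```ruby
# Map a time to one of the 168 hourly slots of a week
# (Sunday 00:00 is slot 0, Saturday 23:00 is slot 167).
def weekly_slot(time)
  time.wday * 24 + time.hour
end

# Pick the ids that belong to a slot. Each id lands in exactly one slot,
# so an hourly run covers every id once per week.
def ids_for_slot(ids, slot, slots: 7 * 24)
  ids.select { |id| id % slots == slot }
end

# Inside the rake task this might look like the following
# (Product and update_price! are assumed names, not from the question):
#
#   task update_prices: :environment do
#     slot = weekly_slot(Time.now.utc)
#     Product.where("id % 168 = ?", slot).find_each do |product|
#       product.update_price!
#       sleep 2 # keep the existing rate limiting
#     end
#   end

# Demo with plain integers standing in for product ids:
p ids_for_slot((1..1000).to_a, weekly_slot(Time.utc(2024, 1, 8, 5)))
# => [29, 197, 365, 533, 701, 869]
```

With a few thousand products, each hourly run handles only a few dozen of them, which should keep it within the "couple of minutes" guideline even with the sleep calls. One caveat: if a scheduled run is skipped, that slot's products simply wait until the following week, so this trades strict guarantees for simplicity.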