I have just installed the whenever gem (https://github.com/javan/whenever) to run my rake tasks, which are nokogiri / feedzilla dependent scraping tasks.
For example, my tasks are called grab_bbc, grab_guardian, etc.
My question: as I update my site, I keep adding more tasks to scheduler.rake.
What should I write in my config/schedule.rb to make all rake tasks run, no matter what they are called?
Would something like this work?
every 12.hours do
  rake:task.each do |task|
    runner task
  end
end
I am new to cron and am using RoR 4.
namespace :sc do
  desc 'All'
  task all: [:create_categories, :create_subcategories]

  desc 'Create categories'
  task create_categories: :environment do
    # your code
  end

  desc 'Create subcategories'
  task create_subcategories: :environment do
    # your code
  end
end
Then, in the console, run: $ rake sc:all
Write a separate rake task for each scraping job, then write an aggregated task that runs all of them.
desc "scrape nytimes"
task :scrape_nytimes do
  # scraping method
end

desc "scrape guardian"
task :scrape_guardian do
  # scraping method
end

desc "perform all scraping"
task :scrape do
  Rake::Task[:scrape_nytimes].execute
  Rake::Task[:scrape_guardian].execute
end
Then call the rake task as:
rake scrape
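The aggregated task above calls execute rather than invoke, and the two behave differently. The sketch below illustrates the difference with hypothetical demo_setup and demo_scrape tasks (names invented for this example). It uses Rake::Task.define_task so it can run as a plain Ruby script; in a Rakefile the usual task DSL would be used instead.

```ruby
require "rake"

runs = []

# Hypothetical demo tasks: demo_scrape depends on demo_setup.
Rake::Task.define_task(:demo_setup) { runs << :demo_setup }
Rake::Task.define_task(demo_scrape: :demo_setup) { runs << :demo_scrape }

scrape = Rake::Task[:demo_scrape]

# execute runs only the task body: prerequisites are skipped and there is
# no "already run" check, so calling it twice runs the body twice.
scrape.execute
scrape.execute
after_execute = runs.dup

runs.clear

# invoke runs prerequisites first, and each task at most once per rake run,
# so the second call does nothing.
scrape.invoke
scrape.invoke
after_invoke = runs.dup
```

If the scraping tasks ever gain a prerequisite such as :environment, invoke is probably the safer choice, since execute would skip it.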
Make sure you have a unique namespace with all the tasks in it, like:
namespace :scrapers do
  desc "Scraper Number 1"
  task :scrape_me do
    # Your code here
  end

  desc "Scraper Number 2"
  task :scrape_it do
    # Your code here
  end
end
You could then run all tasks of that namespace with a task outside of that namespace:
task :run_all_scrapers do
  Rake.application.tasks.each do |task|
    # start_with? is core Ruby, so this works even without ActiveSupport loaded
    task.invoke if task.name.start_with?("scrapers:")
  end
end
That said, I'm pretty sure that this is not how you should run a set of scrapers. If for any reason the if condition returned true for the wrong task, you might unintentionally run tasks like rake db:drop.
Either "manually" maintaining schedule.rb or a master task seems like a better option to me.
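One way to reduce that risk is to list which task names a prefix filter would match before letting it invoke anything. The following is a minimal sketch that runs as a plain Ruby script; the three task names are defined only for the demonstration, and the empty task bodies mean nothing destructive actually runs.

```ruby
require "rake"

# Hypothetical task names, defined only for this demonstration.
["scrapers:scrape_me", "scrapers:scrape_it", "db:drop"].each do |name|
  Rake::Task.define_task(name) { }
end

# Keep only tasks whose full name begins with the namespace prefix,
# including the trailing colon so that a task named e.g. "scrapers_old"
# would not match.
selected = Rake.application.tasks
                           .map(&:name)
                           .select { |n| n.start_with?("scrapers:") }

puts selected
```

Inside a real project the same select could be run from a throwaway rake task to inspect the match list before scheduling anything.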
The aggregated task can be concise:
namespace :scrape do
  desc "scrape nytimes"
  task :nytimes do
    # scraping method
  end

  desc "scrape guardian"
  task :guardian do
    # scraping method
  end
end

desc "perform all scraping"
task scrape: ['scrape:nytimes', 'scrape:guardian']
Namespaces are also a good practice.
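To tie this back to the original question, here is a minimal sketch of config/schedule.rb, assuming the aggregated task is named scrape as in the example above. It uses whenever's rake job type rather than runner, since runner is meant for Ruby code evaluated through the Rails runner, not for rake task names.

```ruby
# config/schedule.rb
# Assumes a top-level rake task called "scrape" that runs all scrapers.
every 12.hours do
  rake "scrape"
end
```

With this in place, schedule.rb does not need to change when a new scraper is added; only the aggregated task does. Running whenever --update-crontab writes the resulting entry to the crontab.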
Use namespace and in_namespace to run all tasks dynamically. I prefer this method because it keeps things clean and saves you from having to remember to update your "parent" task if any of your namespace tasks change.
Note, the example was borrowed from Dmitry Shvetsov's excellent answer.
namespace :scrape do
  desc "scrape nytimes"
  task :nytimes do
    # scraping method
  end

  desc "scrape guardian"
  task :guardian do
    # scraping method
  end
end

desc "perform all scraping"
task :scrape do
  Rake.application.in_namespace(:scrape) { |namespace| namespace.tasks.each(&:invoke) }
end
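The same idea can be tried outside a Rakefile. The sketch below assumes a hypothetical demo_scrape namespace and records which tasks ran; it uses Rake::Task.define_task instead of the task DSL only so that it runs as a standalone Ruby script.

```ruby
require "rake"

ran = []

# Hypothetical demo tasks; in a Rakefile these would be written with the
# usual namespace/task DSL.
Rake.application.in_namespace("demo_scrape") do
  Rake::Task.define_task(:nytimes)  { ran << "nytimes" }
  Rake::Task.define_task(:guardian) { ran << "guardian" }
end

# Re-entering the namespace yields a Rake::NameSpace whose #tasks lists
# every task defined under it, so newly added scrapers are picked up
# without editing a parent task.
Rake.application.in_namespace("demo_scrape") do |ns|
  ns.tasks.each(&:invoke)
end

puts ran.sort
```

Because the list is built from the namespace itself, only tasks under demo_scrape: are invoked, which keeps unrelated tasks out of the run.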