I have an app where users will be uploading files directly to S3. This is working. Now, I need a background worker (presently delayed_job) to retrieve the file and stash it in 'tmp/files' for processing.
How can this be done?
Edit: The app is currently running on EC2.
Background workers will run independently from your web app.
Try Resque for a commonly used Rails background worker solution. The idea is that you start Resque up independently from your web app, and it does its jobs independently of the application.
Have this worker send basic HTTP requests to S3. The S3 API reference card is a good place to get started. The idea is that you use some sort of Ruby REST client to send these requests and parse the responses you get back from S3. rest-client is a gem you can use to do this.
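As a rough illustration of the plain-HTTP approach, here is a minimal sketch that uses only Ruby's standard library (Net::HTTP) rather than rest-client. It assumes the object is reachable with a simple GET, for example a public object or a pre-signed URL; the helper names local_path_for and download_to_tmp are made up for this example, and signing private requests is not covered.

```ruby
require 'net/http'
require 'uri'
require 'fileutils'

# Build the local path from the last segment of the URL path.
# The query string (e.g. a pre-signed URL signature) is ignored.
def local_path_for(url, dest_dir)
  name = File.basename(URI.parse(url).path)
  raise ArgumentError, "URL has no file name: #{url}" if name.empty? || name == '/'
  File.join(dest_dir, name)
end

# Stream the object to disk in chunks so large files are not held in memory.
# Assumption: the URL can be fetched with an unauthenticated GET.
def download_to_tmp(url, dest_dir = 'tmp/files')
  uri  = URI.parse(url)
  path = local_path_for(url, dest_dir)
  FileUtils.mkdir_p(dest_dir)

  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request_get(uri.request_uri) do |response|
      unless response.is_a?(Net::HTTPSuccess)
        raise "S3 returned #{response.code} for #{url}"
      end
      File.open(path, 'wb') do |file|
        response.read_body { |chunk| file.write(chunk) }
      end
    end
  end
  path
end
```

The worker would then call something like download_to_tmp(file_url) and hand the returned path to whatever does the processing.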
Optionally, you can have the worker use the S3 gem, which could be a bit easier.
With this approach, you'd have your worker run a script that does something like:
picture = S3Object.find 'headshot.jpg', 'photos'
Use Resque.
Add these to your Gemfile:
gem 'resque'
gem 'resque-status'
With Resque you need Redis (it stores the job queues and information about the workers). Either use Redis To Go or install Redis locally on your EC2 machine.
Once Resque is installed, edit config/initializers/resque.rb:
rails_root = ENV['RAILS_ROOT'] || File.dirname(__FILE__) + '/../..'
rails_env = ENV['RAILS_ENV'] || 'production'
resque_config = YAML.load_file(rails_root + '/config/resque.yml')
Resque.redis = resque_config[rails_env]
# This is if you are using Redis to go:
# ENV["REDISTOGO_URL"] ||= "redis://REDISTOGOSTUFFGOESHERE"
# uri = URI.parse(ENV["REDISTOGO_URL"])
# Resque.redis = Redis.new(:host => uri.host, :port => uri.port, :password => uri.password, :thread_safe => true)
Resque::Plugins::Status::Hash.expire_in = (24 * 60 * 60) # 24hrs in seconds
Dir["#{Rails.root}/app/workers/*.rb"].each { |file| require file }
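If you go the Redis To Go route instead, the commented-out lines above boil down to splitting a redis:// URL into host, port and password. A minimal sketch using only the standard library; the helper name redis_options_from and the URL are placeholders for illustration, not a real instance.

```ruby
require 'uri'

# Split a redis:// URL into the pieces that Redis.new is given above.
# Assumption: the URL looks like redis://user:password@host:port/
def redis_options_from(url)
  uri = URI.parse(url)
  { host: uri.host, port: uri.port, password: uri.password }
end

options = redis_options_from('redis://redistogo:secret@example.redistogo.com:9402/')
puts options.inspect
```

The resulting hash can then be merged with any other options you pass to Redis.new.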
Here we are using local Redis, so resque.yml looks like this:
development: localhost:6379
test: localhost:6379
fi: localhost:6379
production: localhost:6379
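To make the link between resque.yml and the initializer explicit: the YAML file is just a map from environment name to a "host:port" string, and the initializer picks the entry for the current environment. A small sketch of that lookup, with the YAML inlined so it runs on its own; redis_address_for is a hypothetical helper, and raising on a missing environment is my addition rather than part of the original setup.

```ruby
require 'yaml'

# Inline copy of a resque.yml-style file, only for this example.
RESQUE_YML = <<~YAML
  development: localhost:6379
  test: localhost:6379
  production: localhost:6379
YAML

# Return the "host:port" string configured for the given environment.
# Fails loudly instead of silently handing nil to Resque.redis=.
def redis_address_for(env, yaml_text)
  config = YAML.safe_load(yaml_text)
  config.fetch(env) { raise KeyError, "no Redis address configured for #{env}" }
end

puts redis_address_for('production', RESQUE_YML)
```

Failing early like this makes a misspelled or missing environment key show up at boot rather than when the first job is enqueued.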
You will need something like God to start and manage the workers.
Install it, then add resque-production.god to the config/ folder of your app. You will then be able to start your workers with:
god -c config/resque-production.god
The config/resque-production.god file will contain something like:
rails_env = ENV['RAILS_ENV'] || "production"
rails_root = ENV['RAILS_ROOT'] || File.dirname(__FILE__) + '/..'
num_workers = 1
num_workers.times do |num|
God.watch do |w|
w.dir = "#{rails_root}"
w.name = "resque-#{num}"
w.group = 'resque'
w.interval = 30.seconds
w.env = {"QUEUE"=>"*", "RAILS_ENV"=>rails_env}
w.start = "rake -f #{rails_root}/Rakefile environment resque:work --trace"
w.log = "#{rails_root}/log/resque.log"
w.err_log = "#{rails_root}/log/resque_error.log"
# restart if memory gets too high
w.transition(:up, :restart) do |on|
on.condition(:memory_usage) do |c|
c.above = 350.megabytes
c.times = 2
end
end
# determine the state on startup
w.transition(:init, { true => :up, false => :start }) do |on|
on.condition(:process_running) do |c|
c.running = true
end
end
# determine when process has finished starting
w.transition([:start, :restart], :up) do |on|
on.condition(:process_running) do |c|
c.running = true
c.interval = 5.seconds
end
# failsafe
on.condition(:tries) do |c|
c.times = 5
c.transition = :start
c.interval = 5.seconds
end
end
# start if process is not running
w.transition(:up, :start) do |on|
on.condition(:process_running) do |c|
c.running = false
end
end
end
end
Finally, the workers. They go in the app/workers/ folder (this one is app/workers/processor.rb):
class Processor
include Resque::Plugins::Status
@queue = :collect_queue
def perform
article_id = options["article_id"]
article = Article.find(article_id)
article.download_remote_file(article.file_url)
end
end
It is triggered by the after_create callback in the Article model (app/models/article.rb):
require 'open-uri'

class Article < ActiveRecord::Base
after_create :process
def download_remote_file(url)
# open-uri lets URI.open read URLs like files (Ruby 2.5+);
# plain Kernel#open no longer accepts URLs as of Ruby 3.0
io = URI.open(url)
# overrides Paperclip::Upfile#original_filename;
# we are creating a singleton method on specific object ('io')
def io.original_filename
base_uri.path.split('/').last
end
io.original_filename.blank? ? nil : io
end
def process
Processor.create(:article_id => self.id)
end
end
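Note that download_remote_file hands back an IO object but does not write anything to 'tmp/files', which is what the question asked for. One way to do that, as a sketch under assumptions: stash_io is a hypothetical helper, and it expects any IO-like object (such as the one open-uri returns) plus a file name.

```ruby
require 'fileutils'
require 'stringio'

# Copy an IO-like object into dest_dir and return the path that was written.
# File.basename strips directory parts so the name cannot escape dest_dir.
def stash_io(io, filename, dest_dir = 'tmp/files')
  FileUtils.mkdir_p(dest_dir)
  path = File.join(dest_dir, File.basename(filename))
  File.open(path, 'wb') { |file| IO.copy_stream(io, file) }
  path
end

# Usage with an in-memory stand-in for the downloaded file:
# stash_io(StringIO.new('fake image bytes'), 'headshot.jpg')
```

In the worker this could be called with the io returned by download_remote_file and io.original_filename as the name, after which the processing code reads from the returned path.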