
Accelerate S3 upload with paperclip

I'm using paperclip to upload images to S3, but I've noticed that the upload is very slow. I think it's because, before the submit completes, the file has to pass through my server, get processed, and then be sent on to S3.

Is there a way to speed this up?

Thanks

Do you want to make the upload appear faster, or actually make it faster?

If it's the former, you can move your image-handling logic into a background task using something like delayed_job. That way, when a user clicks the button they go straight to the next page while you process the image (you can show a "processing in progress" placeholder until the task finishes).

If it's the latter then it's entirely down to your server and internet connection. Where are you hosting?

How about uploading direct to S3?

Not sure if paperclip does this out of the box, but you could make it work.

http://docs.amazonwebservices.com/AmazonS3/2006-03-01/dev/index.html?UsingHTTPPOST.html

You did not post any code so I'm going to make a few assumptions here:

  • in your project you have an Album and Image model
  • An Album has_many :images
  • You already have paperclip and aws-sdk set up correctly with buckets and all else
  • You are uploading many images at once

In order to upload many images, your form will look something like this:

<%= form_for @album, html: { multipart: true } do |f| %>
  <%= f.file_field :files, accept: 'image/png,image/jpeg,image/gif', multiple: true %>

  <%= f.submit %>
<% end %>

Your controller will look something like this:

class AlbumsController < ApplicationController
  def update
    @album = Album.find params[:id]
    @album.update album_params
    redirect_to @album, notice: 'Images saved'
  end

  private

  # Strong parameters: permit an array of uploaded files
  def album_params
    params.require(:album).permit files: []
  end
end

In order to create images through an album, your Album model will need:

class Album < ApplicationRecord
  has_many :images, dependent: :destroy

  accepts_nested_attributes_for :images, allow_destroy: true

  def files=(array = [])
    array.each do |f|
      images.create file: f
    end
  end
end

Your Image model will look like this:

class Image < ApplicationRecord
  belongs_to :album

  has_attached_file :file, styles: { thumbnail: '500x500#' }, default_url: '/default.jpg'

  validates_attachment_content_type :file, content_type: /\Aimage\/.*\Z/
end

This is just the important stuff. With this setup, an upload of 22 images with a total of 12MB takes the :files= method 41.1806895 seconds to execute on average on my local server. To check how long a method takes to run, use:

def files=(array = [])
  start = Time.now

  array.each do |f|
    images.create file: f
  end

  p "ELAPSED TIME: #{Time.now - start}"
end
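For what it's worth, Ruby's standard library already includes a timer: Benchmark.realtime returns the wall-clock seconds a block takes, so you don't have to subtract Time.now values by hand. A minimal, Rails-free sketch:

```ruby
require "benchmark"

# Benchmark.realtime returns the wall-clock time the block took, in seconds.
elapsed = Benchmark.realtime do
  sleep 0.1 # stand-in for the slow image-saving loop
end

puts elapsed > 0.09 # => true
```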

You asked for a faster upload of many images. There are a few ways to do this. Passing the uploads straight to a job won't work, because you can't serialize complex data like file objects into a job's arguments.


Use delayed_paperclip instead. It moves the creation of image styles (like thumbnail: '500x500#') into background jobs.

Gemfile

source 'https://rubygems.org'

ruby '2.3.0'

...
gem 'delayed_paperclip'
...

Image file

class Image < ApplicationRecord
  ...
  process_in_background :file
end

It speeds up the :files= method. The same upload as before (22 images, 12MB) with this setup took 23.13998 seconds on my machine. That's 1.77963 times faster than before.


Another way of speeding things up is with Threads. Remove delayed_paperclip from the Gemfile, remove the process_in_background :file line, and update your :files= method:

def files=(array = [])
  threads = []

  array.each do |f|
    threads << Thread.new do
      images.create file: f
    end
  end

  threads.each(&:join)
end

If you try this, you might get a weird error and find that only 4 of the images saved. You also need a Mutex so the threads don't trample shared state. And you must not :join the threads: if you join, the method waits until every thread finishes running, which gives up the speed you were after.

def files=(array = [])
  semaphore = Mutex.new

  array.each do |f|
    Thread.new do
      semaphore.synchronize do
        images.create file: f
      end
    end
  end
end
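To see why skipping :join makes the method return so quickly, here is a plain-Ruby sketch of the same pattern (no Rails needed): a sleep stands in for images.create, the Mutex serializes access to a shared array, and elapsed is measured before any thread has finished.

```ruby
saved     = []
semaphore = Mutex.new

start = Time.now
threads = 4.times.map do |i|
  Thread.new do
    semaphore.synchronize do
      sleep 0.05   # stand-in for images.create
      saved << i   # the Mutex keeps threads from clobbering shared state
    end
  end
end
elapsed = Time.now - start # measured before any join: spawning threads is nearly free

threads.each(&:join)       # joined here only so the sketch can verify the work
puts elapsed < 0.05
puts saved.size
```

The caller sees elapsed (well under one sleep), while the actual saving continues on the spawned threads.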

With this simple change to the method and no added gems, the same upload as before returns from :files= in 0.017628 seconds. That is 1,313 times faster than delayed_paperclip, and 2,336 times faster than the regular setup.


What happens if you use delayed_paperclip AND Threads?

Don't change the :files= method. Just add delayed_paperclip back to your Gemfile and restore the process_in_background :file line.

With this setup on my machine, the method runs in 0.001277 seconds on average. That's

  • 13.8 times faster than Threads
  • 18,120.6 times faster than delayed_paperclip
  • 32,248.0 times faster than regular setup

Remember, this is on my machine and I have not tested this in production. I am also on wifi, not ethernet. All these things can change the results but I think the numbers speak for themselves.

Upload images faster. Done.


UPDATE: Don't use delayed_paperclip. It can overload the database, and some images might never get saved; I've tested it. Plain threads are fast enough. Remove the process_in_background line from the Image model. Here's what my files= method looks like now:

def files=(array = [])
  # Save everything on one background thread so the request returns
  # immediately, and always hand the DB connection back to the pool.
  Thread.new do
    begin
      array.each { |f| images.create file: f }
    ensure
      ActiveRecord::Base.connection_pool.release_connection
    end
  end
end
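The begin/ensure pair is the important part: whatever happens inside the loop, the connection goes back to the pool. Here's a plain-Ruby sketch of that pattern, with a Queue standing in for ActiveRecord's connection pool (the names are illustrative, not Rails API):

```ruby
pool = Queue.new
pool << :connection            # a stand-in for one pooled DB connection

worker = Thread.new do
  conn = pool.pop              # check a connection out of the pool
  begin
    raise "upload failed"      # simulate images.create blowing up mid-loop
  rescue RuntimeError
    # swallow the error for the demo
  ensure
    pool << conn               # ensure runs either way, like release_connection
  end
end
worker.join

puts pool.size # => 1, the connection made it back despite the error
```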

Note: since we push the image saving to a background thread and then redirect, the page that loads will not have the images on it yet; the user has to refresh to see them. One way around this is polling: JavaScript checks the server for changes every 5 seconds or so and updates the page if anything changed.

Another option is to use WebSockets. Now that we have Rails 5, we can use ActionCable. Every time an image gets created, we broadcast an update for its album. If the user is on that album's page, they see updates as soon as they hit the database, without refreshing and without the browser polling every 5 seconds in a loop.
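The broadcast flow can be sketched in plain Ruby with a toy pub/sub object standing in for ActionCable (the channel name and payload below are made up for illustration):

```ruby
# Toy pub/sub: subscribers register a callback per channel;
# broadcast invokes every callback on that channel with the payload.
subscribers = Hash.new { |hash, key| hash[key] = [] }

subscribe = lambda do |channel, &callback|
  subscribers[channel] << callback
end

broadcast = lambda do |channel, payload|
  subscribers[channel].each { |callback| callback.call(payload) }
end

received = []
subscribe.call("album_1") { |data| received << data } # the album page subscribes

# In Rails this would be an after_create_commit callback on Image,
# calling ActionCable.server.broadcast with the new image's URL.
broadcast.call("album_1", image_url: "/uploads/1/thumb.jpg")

puts received.first[:image_url] # => /uploads/1/thumb.jpg
```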

Cool stuff.

As cwninja recommends, we upload directly to S3 to get rid of the extra hop. We use a modified version of the plugin described in this blog post:

http://elctech.wpengine.com/2009/02/updates-on-rails-s3-flash-upload-plugin/

Ours is modified to handle multiple file uploads (we rewrote the Flex object).

Not sure how well this plays with paperclip (we use attachment_fu), but it wasn't too bad to get it working with that.

Use delayed jobs; there's a good example here.
Or you can use a Flash-based upload.

If you end up going the route of uploading directly to S3, which offloads the work from your Rails server, please check out my sample projects:

Sample project using Rails 3, Flash and MooTools-based FancyUploader to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-FancyUploader

Sample project using Rails 3, Flash/Silverlight/GoogleGears/BrowserPlus and jQuery-based Plupload to upload directly to S3: https://github.com/iwasrobbed/Rails3-S3-Uploader-Plupload

By the way, you can do post-processing with Paperclip using something like this blog post describes:

http://www.railstoolkit.com/posts/fancyupload-amazon-s3-uploader-with-paperclip
