
How to parallel process binary files in Ruby?

I'm trying to write a method that splits a binary file into chunks and uploads each chunk to its own URL:

class ChunksClient < ApiStruct::Client
  # Takes the file, splits it into chunks and uploads each chunk to the URL
  # at the same position in the links array
  def upload_chunks(object, links)
    chunk_size = 5242880 # 5 MiB
    links.each do |link|
      chunk = object.read(chunk_size)
      upload_chunk(chunk, link)
    end
  end

  def upload_chunk(chunk, link)
    put(path: link, body: chunk, headers: { 'Content-type': 'application/octet-stream' })
  end
end

However, uploading one chunk at a time is slow, so I tried to process the chunks in parallel:

class ChunksClient < ApiStruct::Client
  # Takes the file, splits it into chunks and uploads each chunk to the URL
  # at the same position in the links array
  def upload_chunks(object, links)
    @chunk_size = 5242880
    @index = 0
    @object = object
    threads = []
    links.each do
      threads << Thread.new do
        chunk, index = take_chunk_with_index
        upload_chunk(chunk, links[index])
      end
    end
    threads.each(&:join)
  end

  private

  def upload_chunk(chunk, link)
    put(path: link, body: chunk, headers: { 'Content-type': 'application/octet-stream' })
  end

  def take_chunk_with_index
    index = @index
    chunk = @object.read(@chunk_size)
    @index += 1
    [chunk, index]
  end
end

But this uploads the chunks to seemingly random links on each run. I could load all the chunks into memory first, but that would cause trouble with big files (several gigabytes, for example).

Is there a correct way to process binary files with threads?

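You should synchronize your take_chunk_with_index method with a Mutex. Without a lock, two threads can read the same @index before either of them increments it, so a chunk gets paired with the wrong link. Create the mutex once in upload_chunks, before any thread is started, and wrap the body of the method in it like so: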

@mutex = Mutex.new

def take_chunk_with_index
  @mutex.synchronize do
    index = @index
    chunk = @object.read(@chunk_size)
    @index += 1
    [chunk, index]
  end
end
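
To see the whole thing working together, here is a minimal, self-contained sketch of the mutex approach. It is not the original ChunksClient: ChunkReader and upload_in_parallel are hypothetical names, the uploader block stands in for the put request, and a StringIO stands in for the file. It also assumes a fixed number of worker threads rather than one thread per link.

```ruby
require 'stringio'

# Hypothetical helper: hands out (chunk, index) pairs from a shared IO.
class ChunkReader
  def initialize(io, chunk_size)
    @io = io
    @chunk_size = chunk_size
    @index = 0
    @mutex = Mutex.new # created once, before any thread starts
  end

  # Reads the next chunk and claims its index as one atomic step.
  # Returns nil once the IO is exhausted.
  def take_chunk_with_index
    @mutex.synchronize do
      chunk = @io.read(@chunk_size)
      return nil if chunk.nil?

      index = @index
      @index += 1
      [chunk, index]
    end
  end
end

# Hypothetical driver: `uploader` is any block taking (chunk, link);
# in the question it would wrap the `put` call.
def upload_in_parallel(io, links, chunk_size:, workers: 4, &uploader)
  reader = ChunkReader.new(io, chunk_size)
  threads = Array.new(workers) do
    Thread.new do
      while (pair = reader.take_chunk_with_index)
        chunk, index = pair
        break if index >= links.size # more chunks than links: stop

        uploader.call(chunk, links[index])
      end
    end
  end
  threads.each(&:join)
end
```

Because each worker holds only one chunk at a time, roughly `workers * chunk_size` bytes are in memory at once, which keeps multi-gigabyte files manageable. The uploads themselves still run outside the lock, so they proceed in parallel.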
