I'm trying to write a function that splits a binary file into chunks and uploads each chunk:
class ChunksClient < ApiStruct::Client
  # Takes the file, splits it into chunks and uploads each chunk to the
  # URL at the corresponding position in array_of_urls
  def upload_chunks(big_file, array_of_urls)
    chunk_size = 5242880 # 5 MiB
    array_of_urls.each do |link|
      chunk = big_file.read(chunk_size)
      upload_chunk(chunk, link)
    end
  end

  def upload_chunk(chunk, link)
    put(path: link, body: chunk, headers: { 'Content-type': 'application/octet-stream' })
  end
end
However, uploading one chunk at a time is slow, so I tried to process them in parallel:
class ChunksClient < ApiStruct::Client
  # Takes the file, splits it into chunks and uploads each chunk to the
  # URL at the corresponding position in array_of_urls
  def upload_chunks(big_file, array_of_urls)
    @chunk_size = 5242880 # 5 MiB
    @index = 0
    @object = big_file
    threads = []
    array_of_urls.each do
      threads << Thread.new do
        chunk, index = take_chunk_with_index
        upload_chunk(chunk, array_of_urls[index])
      end
    end
    threads.each(&:join)
  end

  private

  def upload_chunk(chunk, link)
    put(path: link, body: chunk, headers: { 'Content-type': 'application/octet-stream' })
  end

  def take_chunk_with_index
    index = @index
    chunk = @object.read(@chunk_size)
    @index += 1
    [chunk, index]
  end
end
But this puts chunks into random links on each run. I could just load all the chunks into memory first, but then it would have trouble uploading big files (several gigabytes, for example).
Is there a correct way to process binary files with threads?
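You should synchronize your take_chunk_with_index method with a Mutex. Without it, a thread can be interrupted between reading @index and incrementing it, so another thread may take the same index or read its chunk out of order. Like so: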
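# Create the mutex once per instance, before any thread starts
# (for example at the top of upload_chunks).
@mutex = Mutex.new

def take_chunk_with_index
  @mutex.synchronize do
    index = @index
    chunk = @object.read(@chunk_size)
    @index += 1
    [chunk, index]
  end
end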
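To see the mutex approach working end to end, here is a minimal, self-contained sketch. It makes several assumptions that are not in the original code: the class name ChunkUploader, the `workers:` option, and the `uploaded` hash are hypothetical, and the HTTP `put` call is replaced by recording each chunk in memory so the example runs without ApiStruct. It also uses a small fixed number of worker threads rather than one thread per link, so that only a few chunks are held in memory at once, which may matter for multi-gigabyte files.

```ruby
require 'stringio'

# Hypothetical stand-in for ChunksClient: records "uploads" in a hash
# instead of sending HTTP requests, so the sketch runs on its own.
class ChunkUploader
  attr_reader :uploaded

  def initialize(chunk_size:, workers: 4)
    @chunk_size = chunk_size
    @workers = workers
    @read_mutex = Mutex.new    # guards the shared IO position and @index
    @result_mutex = Mutex.new  # guards the @uploaded hash
    @uploaded = {}
  end

  def upload_chunks(io, links)
    @io = io
    @index = 0
    threads = Array.new(@workers) do
      Thread.new do
        # Each worker keeps taking chunks until the file or the links run out.
        while (pair = take_chunk_with_index(links.size))
          chunk, index = pair
          upload_chunk(chunk, links[index])
        end
      end
    end
    threads.each(&:join)
  end

  private

  # Reading the chunk and claiming its index happen under one lock,
  # so chunk N is always paired with index N.
  def take_chunk_with_index(limit)
    @read_mutex.synchronize do
      next nil if @index >= limit
      chunk = @io.read(@chunk_size)
      next nil if chunk.nil? # end of file
      index = @index
      @index += 1
      [chunk, index]
    end
  end

  # Assumption: the real client would call put(path: link, body: chunk, ...) here.
  def upload_chunk(chunk, link)
    @result_mutex.synchronize { @uploaded[link] = chunk }
  end
end

# Usage: ten 3-byte chunks ("000", "111", ...) and ten placeholder links.
data = (0...10).map { |i| i.to_s * 3 }.join
links = Array.new(10) { |i| "https://example.invalid/part/#{i}" }
uploader = ChunkUploader.new(chunk_size: 3)
uploader.upload_chunks(StringIO.new(data), links)
```

Only the read and the index bookkeeping are serialized; the uploads themselves still run concurrently, which is where the time is spent.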