
Is there any way to implement the Send trait for ZipFile?

I want to read a .zip file in a different thread using the zip crate.

extern crate zip;

use zip::ZipArchive;
use zip::read::ZipFile;
use std::fs::File;
use std::io::BufReader;
use std::thread;

fn compute_hashes(mut file: ZipFile) {
    let reader_thread= thread::spawn(move || {
        let mut reader = BufReader::new(file);
        /* ... */
    });
}

fn main() {
    let mut file = File::open(r"facebook-JakubOnderka.zip").unwrap();
    let mut zip = ZipArchive::new(file).unwrap();

    for i in 0..zip.len() {
        let mut inside = zip.by_index(i).unwrap();

        if !inside.name().ends_with("/") { // skip directories
            println!("Filename: {}", inside.name());
            compute_hashes(inside);
        }
    }
}

But the compiler shows me this error:

error[E0277]: the trait bound `std::io::Read: std::marker::Send` is not satisfied
  --> src/main.rs:10:24
   |
10 |     let reader_thread= thread::spawn(move || {
   |                        ^^^^^^^^^^^^^ `std::io::Read` cannot be sent between threads safely
   |
   = help: the trait `std::marker::Send` is not implemented for `std::io::Read`
   = note: required because of the requirements on the impl of `std::marker::Send` for `&mut std::io::Read`
   = note: required because it appears within the type `std::io::Take<&mut std::io::Read>`
   = note: required because it appears within the type `zip::crc32::Crc32Reader<std::io::Take<&mut std::io::Read>>`
   = note: required because it appears within the type `zip::read::ZipFileReader<'_>`
   = note: required because it appears within the type `zip::read::ZipFile<'_>`
   = note: required because it appears within the type `[closure@src/main.rs:10:38: 13:6 file:zip::read::ZipFile<'_>]`
   = note: required by `std::thread::spawn`

But the same approach works for the type std::fs::File. Do I need to fix the zip crate, or is there another way?

This is a limitation of the zip crate's API, and you can't change it from the outside. A ZipArchive is created by calling new and passing a reader: something that implements Read and Seek. Those are the only requirements on the reader (in particular, the reader doesn't need to be Clone). Thus, the whole ZipArchive can only own one reader.

But the ZipArchive can produce ZipFiles, which implement Read themselves. How does that work if the whole ZipArchive only has one reader? Through sharing! The single reader is shared between the archive and all of its files. But this sharing is not thread safe: each ZipFile stores a mutable reference to the reader as a &mut Read trait object, and because that trait object carries no Send bound, the compiler cannot prove that moving a ZipFile to another thread is sound. That is exactly what the E0277 notes in your error output trace.
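
The root of the error can be reproduced without the zip crate. The following is a minimal sketch using only the standard library; is_send and drain are hypothetical helper names, and Cursor merely stands in for the archive's underlying reader. It shows that a &mut trait object is Send only when the trait object itself is declared Send:

```rust
use std::io::{Cursor, Read};

// Compiles only when `T` is `Send`; returns true so the check can be asserted.
fn is_send<T: Send>(_value: &T) -> bool {
    true
}

// Reads everything from a plain trait-object reader and returns the byte count.
fn drain(reader: &mut dyn Read) -> usize {
    let mut buf = Vec::new();
    reader.read_to_end(&mut buf).expect("read failed");
    buf.len()
}

fn main() {
    let mut cursor = Cursor::new(vec![1u8, 2, 3]);

    // A trait object with an explicit `Send` bound passes the check.
    let sendable: &mut (dyn Read + Send) = &mut cursor;
    assert!(is_send(&sendable));

    // The same reader behind a plain `dyn Read` is still usable locally...
    let plain: &mut dyn Read = &mut cursor;
    // ...but uncommenting the next line is rejected with E0277,
    // the same error class as in the question:
    // is_send(&plain);
    assert_eq!(drain(plain), 3);
}
```

Note that thread::spawn additionally requires the closure to be 'static, so even a Send reference that borrows from a local archive could not be moved into a spawned thread as is.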

This is a known issue with the crate and is being discussed on its GitHub issue tracker.


So what can you do now? Not a whole lot, but a few possibilities (as mentioned by the library author) might be OK for your use case:

  • You could decompress the whole file into memory first, then send the raw data to another thread to do calculations on it. Something like:

     use std::io::Read; // brings `read_to_end` into scope

     let mut data = Vec::new(); // must be mutable: `read_to_end` appends to it
     BufReader::new(file).read_to_end(&mut data)?;
     let reader_thread = thread::spawn(move || {
         // Do stuff with `data`
     });

    But if you only want to compute a cheap hash of each file, loading the contents into memory first is probably slower than computing the hash on the fly, and it might be infeasible if your files are big.

  • You could create one ZipArchive for each thread. This might be very slow if you have many small files in your archive...
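
The first option can be written against any Read implementation, which keeps the borrowed entry on the calling thread and moves only owned bytes into the worker. This is a sketch under assumptions: checksum_in_thread is a hypothetical name, the byte sum is a placeholder for a real hash, and Cursor stands in for a ZipFile (both implement Read).

```rust
use std::io::{Cursor, Read};
use std::thread;

// Reads the whole entry into memory on the calling thread, then moves the
// owned buffer into a worker thread and waits for its result.
fn checksum_in_thread<R: Read>(entry: &mut R) -> std::io::Result<u64> {
    let mut data = Vec::new();
    entry.read_to_end(&mut data)?;

    // `Vec<u8>` is `Send + 'static`, so the closure is allowed to own it.
    let worker = thread::spawn(move || data.iter().map(|&b| u64::from(b)).sum::<u64>());

    Ok(worker.join().expect("worker thread panicked"))
}

fn main() -> std::io::Result<()> {
    // `Cursor` is only a stand-in for an archive entry here.
    let mut entry = Cursor::new(b"hello".to_vec());
    let sum = checksum_in_thread(&mut entry)?;
    println!("checksum: {}", sum);
    Ok(())
}
```

Joining immediately, as done here, makes the work sequential; in a real program you would collect the JoinHandles and join them after all entries have been read.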


A tiny hint: starting a thread costs time. You often don't want to start a thread for each unit of work, but rather maintain a fixed number of threads in a thread pool, manage work in a queue and assign work to idle worker threads. The threadpool crate might serve your needs.
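
To illustrate the idea of a fixed set of workers pulling from a queue, here is a rough sketch built only on the standard library rather than on the threadpool crate, whose API is not shown here. The function name checksum_all is hypothetical, the jobs are buffers that have already been read into memory, and the byte sum again stands in for a real hash.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

// Processes each job on one of `workers` threads and returns the results
// in the same order as the jobs were given.
fn checksum_all(jobs: Vec<Vec<u8>>, workers: usize) -> Vec<u64> {
    let job_count = jobs.len();
    let (job_tx, job_rx) = mpsc::channel::<(usize, Vec<u8>)>();
    let (result_tx, result_rx) = mpsc::channel::<(usize, u64)>();
    // The receiving end is shared, so each idle worker pulls the next job.
    let job_rx = Arc::new(Mutex::new(job_rx));

    let handles: Vec<_> = (0..workers.max(1))
        .map(|_| {
            let job_rx = Arc::clone(&job_rx);
            let result_tx = result_tx.clone();
            thread::spawn(move || loop {
                // Hold the lock only while fetching a job, not while hashing.
                let next = job_rx.lock().unwrap().recv();
                match next {
                    Ok((index, data)) => {
                        let sum: u64 = data.iter().map(|&b| u64::from(b)).sum();
                        result_tx.send((index, sum)).unwrap();
                    }
                    Err(_) => break, // queue closed and drained
                }
            })
        })
        .collect();

    for (index, data) in jobs.into_iter().enumerate() {
        job_tx.send((index, data)).unwrap();
    }
    drop(job_tx); // closing the queue lets the workers exit their loops
    drop(result_tx); // only the workers' clones keep the result channel open

    let mut results = vec![0; job_count];
    for (index, sum) in result_rx {
        results[index] = sum;
    }
    for handle in handles {
        handle.join().unwrap();
    }
    results
}

fn main() {
    let jobs = vec![b"first entry".to_vec(), b"second entry".to_vec()];
    println!("{:?}", checksum_all(jobs, 4));
}
```

The number of threads stays constant no matter how many entries the archive has, which is the point of the hint above.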
