
Rust string comparison same speed as Python. Want to parallelize the program

I am new to Rust. I want to write a function that can later be imported into Python as a module using the pyo3 crate.

Below is the Python implementation of the function I want to implement in Rust:

def pcompare(a, b):
    letters = []
    for i, letter in enumerate(a):
        if letter != b[i]:
            letters.append(f'{letter}{i + 1}{b[i]}')
    return letters

The first Rust implementation I wrote looks like this:

use pyo3::prelude::*;


#[pyfunction]
fn compare_strings_to_vec(a: &str, b: &str) -> PyResult<Vec<String>> {

    if a.len() != b.len() {
        panic!(
            "Reads are not the same length! 
            First string is length {} and second string is length {}.",
            a.len(), b.len());
    }

    let a_vec: Vec<char> = a.chars().collect();
    let b_vec: Vec<char> = b.chars().collect();

    let mut mismatched_chars = Vec::new();

    for (mut index,(i,j)) in a_vec.iter().zip(b_vec.iter()).enumerate() {
        if i != j {
            index += 1;
            let mutation = format!("{i}{index}{j}");
            mismatched_chars.push(mutation);
        } 

    }
    Ok(mismatched_chars)
}


#[pymodule]
fn compare_strings(_py: Python<'_>, m: &PyModule) -> PyResult<()> {
    m.add_function(wrap_pyfunction!(compare_strings_to_vec, m)?)?;
    Ok(())
}

I built it in --release mode. The module could be imported into Python, but its performance was quite similar to that of the Python implementation.

My first question is: why are the Python and Rust functions similar in speed?

Now I am working on a parallel implementation in Rust. When I just print the result variable, the function works:

use rayon::prelude::*;

fn main() {
    
    let a: Vec<char> = String::from("aaaa").chars().collect();
    let b: Vec<char> = String::from("aaab").chars().collect();
    let length = a.len();
    let index: Vec<_> = (1..=length).collect();
    
    let mut mismatched_chars: Vec<String> = Vec::new();
    
    (a, index, b).into_par_iter().for_each(|(x, i, y)| {
        if x != y {
            let mutation = format!("{}{}{}", x, i, y).to_string();
            println!("{mutation}");
            //mismatched_chars.push(mutation);
        }
    });
    
}

However, when I try to push the mutation variable to the mismatched_chars vector:

use rayon::prelude::*;

fn main() {
    
    let a: Vec<char> = String::from("aaaa").chars().collect();
    let b: Vec<char> = String::from("aaab").chars().collect();
    let length = a.len();
    let index: Vec<_> = (1..=length).collect();
    
    let mut mismatched_chars: Vec<String> = Vec::new();
    
    (a, index, b).into_par_iter().for_each(|(x, i, y)| {
        if x != y {
            let mutation = format!("{}{}{}", x, i, y).to_string();
            //println!("{mutation}");
            mismatched_chars.push(mutation);
        }
    });
    
}

I get the following error:

error[E0596]: cannot borrow `mismatched_chars` as mutable, as it is a captured variable in a `Fn` closure
  --> src/main.rs:16:13
   |
16 |             mismatched_chars.push(mutation);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable

For more information about this error, try `rustc --explain E0596`.
error: could not compile `testing_compare_strings` due to previous error

I tried a lot of different things. When I do:

use rayon::prelude::*;

fn main() {
    
    let a: Vec<char> = String::from("aaaa").chars().collect();
    let b: Vec<char> = String::from("aaab").chars().collect();
    let length = a.len();
    let index: Vec<_> = (1..=length).collect();
    
    let mut mismatched_chars: Vec<&str> = Vec::new();
    
    (a, index, b).into_par_iter().for_each(|(x, i, y)| {
        if x != y {
            let mutation = format!("{}{}{}", x, i, y).to_string();
            mismatched_chars.push(&mutation);
        }
    });
    
}

The error becomes:

error[E0596]: cannot borrow `mismatched_chars` as mutable, as it is a captured variable in a `Fn` closure
  --> src/main.rs:16:13
   |
16 |             mismatched_chars.push(&mutation);
   |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ cannot borrow as mutable

error[E0597]: `mutation` does not live long enough
  --> src/main.rs:16:35
   |
10 |     let mut mismatched_chars: Vec<&str> = Vec::new();
   |         -------------------- lifetime `'1` appears in the type of `mismatched_chars`
...
16 |             mismatched_chars.push(&mutation);
   |             ----------------------^^^^^^^^^-
   |             |                     |
   |             |                     borrowed value does not live long enough
   |             argument requires that `mutation` is borrowed for `'1`
17 |         }
   |         - `mutation` dropped here while still borrowed

I suspect that the solution is quite simple, but I cannot see it myself.

You have the right idea, but you will want to use an iterator chain with filter and map to drop iterator items or convert them into different values. Rayon also provides a collect method, similar to the one on regular iterators, that gathers the items into any type implementing FromParallelIterator (such as Vec<T>).

use rayon::prelude::*;

fn compare_strings_to_vec(a: &str, b: &str) -> Vec<String> {
    // Same as with the if statement, but just a little shorter to write
    // Plus, it will print out the two values it is comparing if it errors.
    assert_eq!(a.len(), b.len(), "Reads are not the same length!");
    
    // Zip the character iterators from a and b together
    a.chars().zip(b.chars())
        // Iterate with the index of each item
        .enumerate()
        // Rayon function which turns a regular iterator into a parallel one.
        // Note: par_bridge does not preserve the original item order.
        .par_bridge()
        // Filter out values where the characters are the same
        .filter(|(_, (a, b))| a != b)
        // Convert the remaining values into a mutation string
        .map(|(index, (a, b))| {
            format!("{}{}{}", a, index + 1, b)
        })
        // Turn the items of this iterator into a Vec (or any other FromParallelIterator type).
        .collect()
}
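
For comparison, here is a minimal sketch of the same filter/map/collect chain on ordinary iterators, using only the standard library and no Rayon. The function name mismatches is a hypothetical one chosen for illustration. It also assumes you want to compare character counts rather than byte lengths, since a.len() on a &str counts bytes. Being sequential, it keeps the mismatches in positional order, and for short strings it may well be fast enough on its own.

```rust
/// Hypothetical helper (not part of the answer above): compares two strings
/// character by character and reports each mismatch as "<a><position><b>",
/// with positions starting at 1.
fn mismatches(a: &str, b: &str) -> Vec<String> {
    // Compare character counts, not byte lengths, so that non-ASCII input
    // is treated consistently with the character-wise zip below.
    assert_eq!(
        a.chars().count(),
        b.chars().count(),
        "Reads are not the same length!"
    );

    a.chars()
        // Pair up the characters of both strings
        .zip(b.chars())
        // Attach the zero-based index to each pair
        .enumerate()
        // Keep only the positions where the characters differ
        .filter(|(_, (x, y))| x != y)
        // Format each mismatch with a one-based position
        .map(|(index, (x, y))| format!("{}{}{}", x, index + 1, y))
        .collect()
}
```

Calling mismatches("aaaa", "aaab") yields a single entry, a4b, matching what the Python pcompare function returns for the same input.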


You cannot directly mutate the variable mismatched_chars from multiple threads.

You can wrap it in Arc<RwLock> to share it across threads.

use rayon::prelude::*;
use std::sync::{Arc, RwLock};

fn main() {
    let a: Vec<char> = String::from("aaaa").chars().collect();
    let b: Vec<char> = String::from("aaab").chars().collect();
    let length = a.len();
    let index: Vec<_> = (1..=length).collect();

    let mismatched_chars: Arc<RwLock<Vec<String>>> = Arc::new(RwLock::new(Vec::new()));

    (a, index, b).into_par_iter().for_each(|(x, i, y)| {
        if x != y {
            let mutation = format!("{}{}{}", x, i, y);
            mismatched_chars
                .write()
                .expect("could not acquire write lock")
                .push(mutation);
        }
    });

    for mismatch in mismatched_chars
        .read()
        .expect("could not acquire read lock")
        .iter()
    {
        eprintln!("{}", mismatch);
    }
}
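
The same locking idea can be sketched with the standard library alone, which may help show what the lock is doing. This is an illustration under stated assumptions, not a replacement for the Rayon version: the function name mismatches_locked and the chunk_size parameter are hypothetical. It uses a Mutex instead of an RwLock because every access here is a write, and scoped threads instead of Arc because the threads only borrow the data. Since threads may push in any order, each entry carries its position so the result can be sorted at the end.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical helper: splits both strings into chunks, compares each chunk
/// on its own scoped thread, and pushes mismatches into a Mutex-guarded Vec.
fn mismatches_locked(a: &str, b: &str, chunk_size: usize) -> Vec<String> {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    assert_eq!(a.len(), b.len(), "Reads are not the same length!");
    assert!(chunk_size > 0, "chunk_size must be positive");

    // (position, text) pairs; the position lets us restore order afterwards.
    let found: Mutex<Vec<(usize, String)>> = Mutex::new(Vec::new());

    thread::scope(|scope| {
        for (chunk_index, (chunk_a, chunk_b)) in
            a.chunks(chunk_size).zip(b.chunks(chunk_size)).enumerate()
        {
            // Share the Mutex by reference; scoped threads may borrow it.
            let found = &found;
            scope.spawn(move || {
                let offset = chunk_index * chunk_size;
                for (i, (x, y)) in chunk_a.iter().zip(chunk_b).enumerate() {
                    if x != y {
                        // One-based position within the whole string
                        let position = offset + i + 1;
                        found
                            .lock()
                            .expect("mutex poisoned")
                            .push((position, format!("{x}{position}{y}")));
                    }
                }
            });
        }
    });

    // All threads have finished here, so the lock can be taken apart.
    let mut found = found.into_inner().expect("mutex poisoned");
    found.sort_by_key(|(position, _)| *position);
    found.into_iter().map(|(_, text)| text).collect()
}
```

This sketch spawns one thread per chunk, so chunk_size should be large in practice. Taking a lock on every push also adds contention, which is one reason collecting results from a parallel iterator is often preferred over a shared locked vector.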
