Rust - Collecting slices of a Vec in a recursive function

I am currently trying to build a Huffman encoding program and am struggling with a problem that comes up while traversing my generated Huffman tree to create a lookup table. I decided to implement the traversal with a recursive function. In the actual implementation I use the bitvec crate to store bit sequences, but for simplicity I will use Vec<bool> in this post.

The idea I had was to store all codewords in a single Vec, codewords, and then keep only a slice of that vector per symbol in the actual lookup table, for which I used a HashMap.

The issue is how exactly to add a 0 or a 1 for the left and right traversal. My idea here was to save a clone of a slice of the current sequence, append a 0 to codewords, then append that clone to the end of codewords after traversing to the left, so that I can push a 1 and traverse to the right. The function I came up with looks like this:

use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

fn traverse_tree<'a>(
    cur_index: usize,
    height: i16,
    codewords: &'a mut Vec<bool>,
    lookup_table: &mut HashMap<u8, &'a [bool]>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        lookup_table.insert(cur_node.val, cur_sequence);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}

Apparently there is an issue with lifetimes and borrowing in that snippet of code, and I kind of get what the problem is. From what I understand, when I pass codewords as a parameter in the recursive call, it has to borrow the vector for as long as I keep the slice in lookup_table, which is obviously not possible, causing the error. How do I solve this?

This is what cargo check gives me:

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:43:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
43 |       codewords.append(&mut cur_sequence); // second mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:44:5
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
44 |       codewords.push(true); // third mutable borrow occurs here
   |       ^^^^^^^^^ second mutable borrow occurs here

error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
  --> untitled.rs:48:9
   |
14 |   fn traverse_tree<'a>(
   |                    -- lifetime `'a` defined here
...
34 | /     traverse_tree(
35 | |         cur_node.left as usize,
36 | |         height + 1,
37 | |         codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
   | |         --------- first mutable borrow occurs here
38 | |         lookup_table,
39 | |         huffman_tree,
40 | |     );
   | |_____- argument requires that `*codewords` is borrowed for `'a`
...
48 |           codewords, // fourth mutable borrow occurs here
   |           ^^^^^^^^^ second mutable borrow occurs here

What am I missing here? Is there some magical function in the vector API that I'm missing, and why exactly does this create lifetime issues in the first place? From what I can tell, all my lifetimes are correct, because codewords always lives long enough for lookup_table to hold all those slices and I never mutably borrow something twice at the same time. If there was something wrong with my lifetimes, the compiler would complain inside the if cur_node.left == -1 block, and the cur_sequence I take after it is an owned Vec, so there can't be any borrowing issues with that. So the issue really is with the core idea of having a recursive function with a mutable reference as a parameter.

Is there any way for me to solve this? I tried making codewords owned and returning it, but then the compiler cannot ensure that the bit sequence I'm saving inside lookup_table lives long enough. The only idea I still have is to store owned vectors inside lookup_table, but at that point the codewords vector is obsolete in the first place, and I could simply implement this with a cur_sequence vector as a parameter that I clone in every call. However, I chose my approach for better cache performance in the actual encoding process right after, which I would then lose.

The problem is that when you create a slice cur_sequence from codewords, as you did in let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];, the compiler has to keep the borrow of codewords alive for at least as long as cur_sequence. The reason: the compiler wants to ensure that the slice cur_sequence is always valid, but if you change codewords (say, clear it), cur_sequence could become invalid. As long as a reference into codewords is alive, the borrow rules forbid modifying codewords. Unfortunately, you store cur_sequence in lookup_table, which keeps the borrow of codewords alive for the whole lifetime 'a, that is, beyond the end of the function. So after the first recursive call you cannot mutably borrow codewords anymore.
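To see the effect in isolation, here is a minimal sketch that is not taken from the question: store is a hypothetical helper with the same signature shape as traverse_tree (a &'a mut Vec<bool> next to a map that holds &'a [bool]). It compiles as written; uncommenting the marked line should reproduce the same class of error (E0499), because the table still holds a slice into buf.

```rust
use std::collections::HashMap;

// Hypothetical helper, same signature shape as `traverse_tree` in the question:
// the slice stored in `table` borrows from `buf` for the whole lifetime 'a.
fn store<'a>(buf: &'a mut Vec<bool>, table: &mut HashMap<u8, &'a [bool]>) {
    buf.push(true);
    // Allowed: `buf` is not used again inside this function afterwards.
    table.insert(0u8, &buf[..]);
}

fn main() {
    let mut buf = Vec::new();
    let mut table = HashMap::new();
    store(&mut buf, &mut table);

    // buf.push(false); // rejected while `table` is still used below:
    //                  // `buf` stays mutably borrowed for as long as the slice lives

    println!("{:?}", table[&0u8]);
}
```

Inside traverse_tree the situation is the same, except that the caller and the callee are the same function: the first recursive call hands codewords away for all of 'a, so nothing after that call may touch it.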

The solution is to maintain the indexes of the slice yourself. Create a struct:

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

then use it instead of the slices:

// half-open range [start, end) covering the last `height` bits in codewords
let cur_range = Range::new(
    codewords.len() - height as usize,
    codewords.len(),
);
lookup_table.insert(cur_node.val, cur_range);

This way, the responsibility for keeping the ranges valid is yours.
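Once the table stores index pairs, a range has to be turned back into a slice at encoding time. The following is a rough sketch of how that lookup could work, not part of the original code: codeword and encode are hypothetical helper names, and end is assumed to be exclusive. Using slice get instead of indexing means a stale or out-of-bounds range yields None rather than a panic.

```rust
use std::collections::HashMap;

// Same shape as the struct above; `end` is treated as exclusive (an assumption).
struct Range {
    start: usize,
    end: usize,
}

// Hypothetical helper: resolve a symbol to its codeword bits.
fn codeword<'a>(
    codewords: &'a [bool],
    lookup_table: &HashMap<u8, Range>,
    symbol: u8,
) -> Option<&'a [bool]> {
    let range = lookup_table.get(&symbol)?;
    // `get` returns None instead of panicking if the range is invalid
    codewords.get(range.start..range.end)
}

// Hypothetical helper: concatenate the codewords of all input bytes.
fn encode(
    input: &[u8],
    codewords: &[bool],
    lookup_table: &HashMap<u8, Range>,
) -> Option<Vec<bool>> {
    let mut out = Vec::new();
    for &symbol in input {
        out.extend_from_slice(codeword(codewords, lookup_table, symbol)?);
    }
    Some(out)
}

fn main() {
    // Hand-written table for illustration: a = 0, b = 10, c = 11
    let codewords = vec![false, true, false, true, true];
    let mut lookup_table = HashMap::new();
    lookup_table.insert(b'a', Range { start: 0, end: 1 });
    lookup_table.insert(b'b', Range { start: 1, end: 3 });
    lookup_table.insert(b'c', Range { start: 3, end: 5 });

    println!("{:?}", encode(b"cab", &codewords, &lookup_table));
}
```

The standard library's std::ops::Range<usize> could replace the custom struct, since a Vec can be indexed by it directly. Either way the slices are only borrowed during encoding, after the table has been fully built, so the cache-friendly layout of one shared codewords vector is kept.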

Complete code:

use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

struct Range {
    start: usize,
    end: usize
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range{ start, end}
    }
}

fn traverse_tree(
    cur_index: usize,
    height: i16,
    codewords: &mut Vec<bool>,
    lookup_table: &mut HashMap<u8, Range>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // record the last `height` bits in codewords as a half-open range [start, end)
        let cur_range = Range::new(
            codewords.len() - height as usize,
            codewords.len(),
        );
        lookup_table.insert(cur_node.val, cur_range);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // reborrowed only for the duration of this call
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // fine now: no slice into codewords is stored anywhere
    codewords.push(true);
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords,
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}
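As a possible variation, not something the original code does, the per-node to_vec clone can be avoided by keeping the current path in a separate scratch vector and copying it into codewords only at the leaves. The sketch below assumes the same index-based tree layout, with a trimmed copy of HufTreeNode, and uses std::ops::Range<usize> for the table entries; collect_codes, build_table, and path are hypothetical names.

```rust
use std::collections::HashMap;
use std::ops::Range;

// Trimmed copy of the node type from the question (freq omitted for brevity).
pub struct HufTreeNode {
    pub val: u8,
    pub left: i16,
    pub right: i16,
}

fn collect_codes(
    cur_index: usize,
    path: &mut Vec<bool>,      // bits on the way from the root to the current node
    codewords: &mut Vec<bool>, // all finished codewords, stored back to back
    lookup_table: &mut HashMap<u8, Range<usize>>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf: copy the path once
    if cur_node.left == -1 {
        let start = codewords.len();
        codewords.extend_from_slice(path.as_slice());
        lookup_table.insert(cur_node.val, start..codewords.len());
        return;
    }

    path.push(false);
    collect_codes(cur_node.left as usize, path, codewords, lookup_table, huffman_tree);
    path.pop();

    path.push(true);
    collect_codes(cur_node.right as usize, path, codewords, lookup_table, huffman_tree);
    path.pop();
}

fn build_table(
    huffman_tree: &[HufTreeNode],
    root: usize,
) -> (Vec<bool>, HashMap<u8, Range<usize>>) {
    let mut codewords = Vec::new();
    let mut lookup_table = HashMap::new();
    collect_codes(root, &mut Vec::new(), &mut codewords, &mut lookup_table, huffman_tree);
    (codewords, lookup_table)
}

fn main() {
    // node 0: root (children 1, 2); node 1: leaf 'a';
    // node 2: internal (children 3, 4); node 3: leaf 'b'; node 4: leaf 'c'
    let tree = vec![
        HufTreeNode { val: 0, left: 1, right: 2 },
        HufTreeNode { val: b'a', left: -1, right: -1 },
        HufTreeNode { val: 0, left: 3, right: 4 },
        HufTreeNode { val: b'b', left: -1, right: -1 },
        HufTreeNode { val: b'c', left: -1, right: -1 },
    ];
    let (codewords, lookup_table) = build_table(&tree, 0);
    for (symbol, range) in &lookup_table {
        println!("{} -> {:?}", *symbol as char, &codewords[range.clone()]);
    }
}
```

With this layout codewords contains only finished codewords, so it stays as compact as the sum of the code lengths. A tree consisting of a single leaf would get an empty codeword here and would need separate handling.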
