Rust - Collecting slices of a Vec in a recursive function
I am currently trying to build a Huffman encoding program and am struggling with a problem I have while traversing my generated Huffman tree to create a lookup table. I decided to implement said traversal with a recursive function. In the actual implementation I use the bitvec crate to save bit sequences, but for simplicity I will use `Vec<bool>` in this post.
The idea I had was to save a collection of all codewords in the Vec `codewords` and then only save a slice out of that vector for the actual lookup table, for which I used a `HashMap`.
The issue is how exactly I would solve adding a 0 or a 1 for both the left and right traversal. My idea here was to save a clone of a slice of the current sequence, append a 0 to `codewords`, then append that clone to the end of `codewords` after traversing to the left so that I can push a 1 and traverse to the right. The function I came up with looks like this:
use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

fn traverse_tree<'a>(
    cur_index: usize,
    height: i16,
    codewords: &'a mut Vec<bool>,
    lookup_table: &mut HashMap<u8, &'a [bool]>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // the last `height` bits in codewords
        let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];
        lookup_table.insert(cur_node.val, cur_sequence);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence); // second mutable borrow occurs here
    codewords.push(true); // third mutable borrow occurs here
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords, // fourth mutable borrow occurs here
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}
Apparently there is an issue with lifetimes and borrowing in that snippet of code, and I kind of get what the problem is. From what I understand, when I pass `codewords` as a parameter in the recursive call, it has to borrow the vector for as long as I save the slice in `lookup_table`, which is obviously not possible, causing the error. How do I solve this?
This is what `cargo check` gives me:
error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
--> untitled.rs:43:5
|
14 | fn traverse_tree<'a>(
| -- lifetime `'a` defined here
...
34 | / traverse_tree(
35 | | cur_node.left as usize,
36 | | height + 1,
37 | | codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
| | --------- first mutable borrow occurs here
38 | | lookup_table,
39 | | huffman_tree,
40 | | );
| |_____- argument requires that `*codewords` is borrowed for `'a`
...
43 | codewords.append(&mut cur_sequence); // second mutable borrow occurs here
| ^^^^^^^^^ second mutable borrow occurs here
error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
--> untitled.rs:44:5
|
14 | fn traverse_tree<'a>(
| -- lifetime `'a` defined here
...
34 | / traverse_tree(
35 | | cur_node.left as usize,
36 | | height + 1,
37 | | codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
| | --------- first mutable borrow occurs here
38 | | lookup_table,
39 | | huffman_tree,
40 | | );
| |_____- argument requires that `*codewords` is borrowed for `'a`
...
44 | codewords.push(true); // third mutable borrow occurs here
| ^^^^^^^^^ second mutable borrow occurs here
error[E0499]: cannot borrow `*codewords` as mutable more than once at a time
--> untitled.rs:48:9
|
14 | fn traverse_tree<'a>(
| -- lifetime `'a` defined here
...
34 | / traverse_tree(
35 | | cur_node.left as usize,
36 | | height + 1,
37 | | codewords, // mutable borrow - argument requires that `*codewords` is borrowed for `'a`
| | --------- first mutable borrow occurs here
38 | | lookup_table,
39 | | huffman_tree,
40 | | );
| |_____- argument requires that `*codewords` is borrowed for `'a`
...
48 | codewords, // fourth mutable borrow occurs here
| ^^^^^^^^^ second mutable borrow occurs here
What am I missing here? Is there some magical function in the vector API that I'm missing, and why exactly does this create lifetime issues in the first place? From what I can tell, all my lifetimes are correct because `codewords` always lives long enough for `lookup_table` to save all those slices, and I never mutably borrow something twice at the same time. If there was something wrong with my lifetimes, the compiler would complain inside the `if cur_node.left == -1` block, and the `cur_sequence` I take after it is an owned `Vec`, so there can't be any borrowing issues with that. So the issue really is with the core idea of having a recursive function with a mutable reference as a parameter.
Is there any way for me to solve this? I tried making `codewords` owned and returning it, but then the compiler cannot ensure that the bit sequence I'm saving inside `lookup_table` lives long enough. The only idea I still have is to save owned vectors inside `lookup_table`, but at that point the `codewords` vector is obsolete in the first place, and I could simply implement this with a `cur_sequence` vector as a parameter which I clone in every call. However, I chose my approach for better cache performance in the actual encoding process right after, which I would then lose.
The problem is that when you create a slice `cur_sequence` from `codewords`, as you did in `let cur_sequence = &codewords[(codewords.len() - 1 - height as usize)..];`, the compiler requires the borrow of `codewords` to last at least as long as `cur_sequence`. The reason is that the compiler wants to ensure that the slice `cur_sequence` is always valid, but if you change `codewords` (say, clear it), `cur_sequence` could become invalid. As long as the shared borrow of `codewords` is alive, the borrow rules forbid modifying `codewords`. Unfortunately, you save `cur_sequence` in `lookup_table`, which keeps the borrow of `codewords` alive for the rest of the function, so you cannot mutably borrow `codewords` anymore.
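To make the rule concrete, here is a minimal sketch, separate from the Huffman code, of a shared slice borrow that ends before the vector is mutated. The function name `count_then_push` is made up for this illustration. Because the slice is not used after the `push`, the compiler accepts it; storing the slice somewhere longer-lived, as `lookup_table` does, is what removes that freedom.

```rust
// Counts how many of the first `n` bits are set, then appends one more bit.
fn count_then_push(codewords: &mut Vec<bool>, n: usize) -> usize {
    // A shared borrow of `codewords` starts here...
    let head: &[bool] = &codewords[..n];
    let ones = head.iter().filter(|b| **b).count();

    // ...and ends after the last use of `head`, so mutation is allowed again.
    codewords.push(true);

    // Using `head` after the push (for example `head.len()`) would be rejected
    // with E0502, because `push` may reallocate and leave the slice dangling.
    ones
}
```

Calling `count_then_push(&mut v, 2)` on `vec![true, false, true]` should return 1 and leave `v` with four elements.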
The solution is to maintain the indices of the slice yourself. Create a struct:
struct Range {
    start: usize,
    end: usize,
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range { start, end }
    }
}
Then use it instead of the slices:
let cur_range = Range::new(
    codewords.len() - 1 - height as usize,
    codewords.len() - 1,
);
lookup_table.insert(cur_node.val, cur_range);
This way, the responsibility for keeping the ranges valid is yours.
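With indices stored instead of slices, a slice can be borrowed again at lookup time. The sketch below assumes that both `start` and `end` are inclusive, which is how the snippet above fills them in (`end` is `codewords.len() - 1`). The `slice` method is a hypothetical helper, not part of the original answer.

```rust
// Mirrors the `Range` struct above; both indices are treated as inclusive.
struct Range {
    start: usize,
    end: usize,
}

impl Range {
    // Borrow the stored codeword from the backing buffer at lookup time.
    // The returned slice only borrows `codewords` for as long as it is used.
    fn slice<'a>(&self, codewords: &'a [bool]) -> &'a [bool] {
        &codewords[self.start..=self.end]
    }
}
```

Because the borrow is created on demand, `codewords` is free to be mutated between lookups, as long as the stored indices stay within bounds.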
Complete code:
use std::collections::HashMap;

// ignore everything being public, I use getters in the real code
pub struct HufTreeNode {
    pub val: u8,
    pub freq: usize,
    pub left: i16,
    pub right: i16,
}

// start and end are indices into `codewords`, both inclusive
struct Range {
    start: usize,
    end: usize,
}

impl Range {
    fn new(start: usize, end: usize) -> Self {
        Range { start, end }
    }
}

fn traverse_tree(
    cur_index: usize,
    height: i16,
    codewords: &mut Vec<bool>,
    lookup_table: &mut HashMap<u8, Range>,
    huffman_tree: &[HufTreeNode],
) {
    let cur_node = &huffman_tree[cur_index];

    // if the left child is -1, we reached a leaf
    if cur_node.left == -1 {
        // store indices instead of the slice
        // `&codewords[(codewords.len() - 1 - height as usize)..]`
        let cur_range = Range::new(
            codewords.len() - 1 - height as usize,
            codewords.len() - 1,
        );
        lookup_table.insert(cur_node.val, cur_range);
        return;
    }

    // save the current sequence so we can traverse to the right afterwards
    let mut cur_sequence = codewords[(codewords.len() - 1 - height as usize)..].to_vec();
    codewords.push(false);
    // no borrow is stored in lookup_table, so this reborrow ends when the call returns
    traverse_tree(
        cur_node.left as usize,
        height + 1,
        codewords,
        lookup_table,
        huffman_tree,
    );

    // append the previously saved current sequence
    codewords.append(&mut cur_sequence);
    codewords.push(true);
    traverse_tree(
        cur_node.right as usize,
        height + 1,
        codewords,
        lookup_table,
        huffman_tree,
    );
}

fn main() {
    // ...
}
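If the goal is still a `HashMap<u8, &[bool]>` for cache-friendly encoding, one possible approach is two phases: collect index ranges while `codewords` is being mutated, then convert them to slices once it no longer changes. The sketch below rests on several assumptions that are not in the original post: it uses half-open `std::ops::Range<usize>` values, starts from an empty `codewords`, takes the prefix as the last `depth` bits (so its indexing differs from the code above), and the names `collect_ranges`, `freeze`, and `demo_tree` are hypothetical.

```rust
use std::collections::HashMap;
use std::ops::Range;

// Same node layout as in the post, minus `freq`; -1 marks "no child".
pub struct HufTreeNode {
    pub val: u8,
    pub left: i16,
    pub right: i16,
}

// Phase 1: record where each codeword lives. Assumes `codewords` is empty at
// the root call, so the current prefix is always the last `depth` bits.
fn collect_ranges(
    idx: usize,
    depth: usize,
    codewords: &mut Vec<bool>,
    ranges: &mut HashMap<u8, Range<usize>>,
    tree: &[HufTreeNode],
) {
    let node = &tree[idx];
    if node.left == -1 {
        let end = codewords.len();
        ranges.insert(node.val, end - depth..end);
        return;
    }
    // Copy the prefix before the left subtree appends its own codewords.
    let prefix = codewords[codewords.len() - depth..].to_vec();
    codewords.push(false);
    collect_ranges(node.left as usize, depth + 1, codewords, ranges, tree);
    codewords.extend_from_slice(&prefix);
    codewords.push(true);
    collect_ranges(node.right as usize, depth + 1, codewords, ranges, tree);
}

// Phase 2: once `codewords` is no longer mutated, turn ranges into slices.
fn freeze<'a>(
    codewords: &'a [bool],
    ranges: &HashMap<u8, Range<usize>>,
) -> HashMap<u8, &'a [bool]> {
    ranges
        .iter()
        .map(|(sym, r)| (*sym, &codewords[r.clone()]))
        .collect()
}

// Hypothetical tree: root(0) -> leaf 'a'(1) and inner(2) -> leaves 'b'(3), 'c'(4).
fn demo_tree() -> Vec<HufTreeNode> {
    vec![
        HufTreeNode { val: 0, left: 1, right: 2 },
        HufTreeNode { val: b'a', left: -1, right: -1 },
        HufTreeNode { val: 0, left: 3, right: 4 },
        HufTreeNode { val: b'b', left: -1, right: -1 },
        HufTreeNode { val: b'c', left: -1, right: -1 },
    ]
}
```

Calling `collect_ranges(0, 0, &mut codewords, &mut ranges, &demo_tree())` followed by `freeze(&codewords, &ranges)` should map `b'a'` to `[false]`, `b'b'` to `[true, false]`, and `b'c'` to `[true, true]`. The bits stay contiguous in one vector, which is the property the question wanted to keep for the encoding step.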