Split a Vec<u8> matching a Vec<u8> pattern
let input = vec![
1, 1, 1,
98, 99,
2, 2, 2, 2,
98, 99,
3, 3
];
How can I split by [98, 99] to get:
let output = vec![
vec![1, 1, 1],
vec![2, 2, 2, 2],
vec![3, 3],
];
So far I have only found a way to split into two parts once, but I need to split every time the pattern [98, 99] is found:
let sep = vec![98, 99];
let (a, b) = input.split_at(
    // Index of the first window equal to the separator (0 if there is none).
    input.windows(sep.len())
        .position(|w| w == &sep)
        .unwrap_or_default(),
);
Is there a way to do it via stdlib without manually looping through the collection?
As @BurntShushi5 suggests, use the bstr crate:
use bstr::ByteSlice;
fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];
    let pattern: [u8; 2] = [98, 99];
    let result: Vec<Vec<u8>> = input.split_str(&pattern).map(|x| x.to_vec()).collect();
    println!("{:?}", result);
}
If your bytes are valid UTF-8, this may do. The idea is to use the split method from str:
fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];
    let s = std::str::from_utf8(&input).unwrap();
    let pattern = std::str::from_utf8(&[98, 99]).unwrap();
    let new_vector: Vec<Vec<u8>> = s.split(pattern).map(|s| s.as_bytes().to_vec()).collect();
    println!("{:?}", new_vector);
}
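Note that the unwrap calls above panic when the bytes are not valid UTF-8. A minimal sketch of the same idea that reports the failure instead, assuming a hypothetical helper name split_utf8 (it is not part of std):

```rust
/// Hypothetical helper: splits via `str::split`, returning `None` when
/// either argument is not valid UTF-8.
fn split_utf8(input: &[u8], pattern: &[u8]) -> Option<Vec<Vec<u8>>> {
    let s = std::str::from_utf8(input).ok()?;
    let p = std::str::from_utf8(pattern).ok()?;
    Some(s.split(p).map(|part| part.as_bytes().to_vec()).collect())
}

fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];
    println!("{:?}", split_utf8(&input, &[98, 99]));
    // 0xFF can never appear in valid UTF-8, so this input is rejected.
    println!("{:?}", split_utf8(&[1, 0xFF, 98, 99, 2], &[98, 99]));
}
```

For arbitrary binary data this approach is probably not suitable, since any invalid byte sequence makes the whole split fail.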
AFAIK there is no function in the std lib that splits byte slices on a sub-slice pattern (slice::split takes a per-element predicate, and only str::split accepts a multi-element pattern). You can however let the stdlib handle the looping for you using fold:
let input = vec![
1, 1, 1,
98, 99,
2, 2, 2, 2,
98, 99,
3, 3
];
let sep = vec![ 98, 99 ];
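let acc = input.iter().fold(
    // State: (finished chunks, current chunk, number of separator items matched so far)
    (vec![], vec![], 0),
    |(mut res, mut cur, count), &val| {
        if val == sep[count] {
            if count + 1 == sep.len() {
                // Full separator matched: close the current chunk.
                res.push(cur);
                (res, vec![], 0)
            } else {
                (res, cur, count + 1)
            }
        } else if count != 0 {
            // Partial match failed: the matched prefix was ordinary data.
            cur.extend_from_slice(&sep[..count]);
            if val == sep[0] {
                // The current value may itself start a new match.
                (res, cur, 1)
            } else {
                cur.push(val);
                (res, cur, 0)
            }
        } else {
            cur.push(val);
            (res, cur, 0)
        }
    });
let (mut output, mut last, pending) = acc;
// A separator prefix left unfinished at the end of the input is ordinary data.
last.extend_from_slice(&sep[..pending]);
if !last.is_empty() { output.push(last); }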
This can also be adapted to work with references, thus removing the need for the inputs to implement Copy:
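let acc = input.iter().enumerate().fold(
    // State: (finished chunks, chunk start, chunk end, number of separator items matched so far)
    (vec![], 0, 0, 0),
    |(mut res, beg, end, count), (cur, val)| {
        if val == &sep[count] {
            if count + 1 == sep.len() {
                // Full separator matched: emit the chunk that precedes it.
                res.push(&input[beg..end]);
                (res, cur + 1, cur + 1, 0)
            } else {
                (res, beg, end, count + 1)
            }
        } else if count != 0 && val == &sep[0] {
            // The failed partial match is data; this value may start a new match.
            (res, beg, cur, 1)
        } else {
            (res, beg, cur + 1, 0)
        }
    });
let mut output = acc.0;
// Everything after the last separator, including an unfinished match, is the final chunk.
if acc.1 < input.len() { output.push(&input[acc.1..]); }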
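Note that the output of this second version is a vector of slices instead of a vector of vectors. Both fold versions track only a single candidate match, so a separator whose prefix can overlap itself (for example [1, 1, 2]) may be missed.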
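For comparison, the windows/position search from the question can be repeated until no separator is left. This is only a sketch: split_on is a hypothetical helper name rather than a std function, and it still contains an explicit loop, but it returns borrowed slices and copes with overlapping prefixes because each search restarts on the remaining input.

```rust
/// Hypothetical helper (not part of std): splits `input` on every
/// occurrence of `sep`, returning borrowed sub-slices.
fn split_on<'a, T: PartialEq>(input: &'a [T], sep: &[T]) -> Vec<&'a [T]> {
    // `windows(0)` panics, and an empty separator has no meaningful match,
    // so return the input whole in that case.
    if sep.is_empty() {
        return vec![input];
    }
    let mut parts = Vec::new();
    let mut rest = input;
    // Repeatedly find the next occurrence of `sep` in what is left.
    while let Some(pos) = rest.windows(sep.len()).position(|w| w == sep) {
        parts.push(&rest[..pos]);
        rest = &rest[pos + sep.len()..];
    }
    // Whatever follows the last separator is the final chunk.
    parts.push(rest);
    parts
}

fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];
    let output = split_on(&input, &[98, 99]);
    println!("{:?}", output);
}
```

Like str::split, this sketch yields an empty chunk when the input starts or ends with the separator; filter those out if that is not wanted.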