
Split a Vec<u8> matching a Vec<u8> pattern

let input = vec![
    1, 1, 1,
    98, 99,
    2, 2, 2, 2,
    98, 99,
    3, 3
];

How can I split it by [98, 99] to get:

let output = vec![
    vec![1, 1, 1],
    vec![2, 2, 2, 2],
    vec![3, 3],
];

So far I have only found a way to split once, into two parts, but I need to split at every occurrence of the pattern [98, 99]:

let sep = vec![98, 99];
let (a, b) = input.split_at(
    input.windows(sep.len())
        .position(|w| w == &sep)
        .unwrap_or_default(),
);

Is there a way to do this with the standard library, without manually looping through the collection?

As @BurntSushi5 suggests, use the bstr crate:

use bstr::ByteSlice;

fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];
    let pattern: [u8; 2] = [98, 99];
    let result: Vec<Vec<u8>> = input.split_str(&pattern).map(|x|x.to_vec()).collect();
    println!("{:?}", result);
}


If your bytes are valid UTF-8, this may do. The idea is to use the split method from str. Note that from_utf8 returns an error for invalid input, so this only works when both the data and the pattern are valid UTF-8:

fn main() {
    let input: Vec<u8> = vec![1, 1, 1, 98, 99, 2, 2, 2, 2, 98, 99, 3, 3];

    let s = std::str::from_utf8(&input).unwrap();
    let pattern = std::str::from_utf8(&[98, 99]).unwrap();
    let new_vector: Vec<Vec<u8>> = s.split(pattern).map(|s| s.as_bytes().to_vec()).collect();
    println!("{:?}", new_vector);
}
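
Because from_utf8 fails on bytes that are not valid UTF-8 (0xFF, for instance), the unwrap calls above would panic on arbitrary binary data. A minimal sketch that surfaces the error to the caller instead; the function name split_utf8 is made up for this example and is not part of any library:

```rust
use std::str::Utf8Error;

/// Hypothetical helper: splits `input` on `pattern` by going through `str::split`.
/// Returns an error if either argument is not valid UTF-8.
fn split_utf8(input: &[u8], pattern: &[u8]) -> Result<Vec<Vec<u8>>, Utf8Error> {
    // Both conversions are checked; `?` propagates the first failure.
    let s = std::str::from_utf8(input)?;
    let p = std::str::from_utf8(pattern)?;
    // Split the string view, then copy each part back into an owned byte vector.
    Ok(s.split(p).map(|part| part.as_bytes().to_vec()).collect())
}
```

With the input from the question, split_utf8(&input, &[98, 99]) returns Ok with the three expected vectors, while an input containing 0xFF returns Err.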


AFAIK the standard library has no function for splitting a byte slice on a multi-element pattern: slice::split only takes a per-element predicate, and only str can split on a longer pattern. You can however let the stdlib handle the looping for you using fold:

let input = vec![
    1, 1, 1,
    98, 99,
    2, 2, 2, 2,
    98, 99,
    3, 3
];
let sep = vec![ 98, 99 ];
    
// Append each element to the current chunk; when the chunk ends with the
// separator, strip the separator off and start a new chunk.
let (mut output, rest) = input.iter().fold(
    (vec![], vec![]),
    |(mut res, mut cur), &val| {
        cur.push(val);
        if cur.ends_with(&sep) {
            cur.truncate(cur.len() - sep.len());
            res.push(cur);
            (res, vec![])
        } else {
            (res, cur)
        }
    });
// Keep whatever follows the last separator.
if !rest.is_empty() { output.push(rest); }


This can also be adapted to work with references, which removes the need for the elements to implement Copy:

// Track only where the current chunk begins; when the elements since `beg`
// end with the separator, record the part before it as a sub-slice.
let (mut output, beg) = input.iter().enumerate().fold(
    (vec![], 0),
    |(mut res, beg), (cur, _)| {
        let chunk = &input[beg..=cur];
        if chunk.ends_with(&sep) {
            res.push(&chunk[..chunk.len() - sep.len()]);
            (res, cur + 1)
        } else {
            (res, beg)
        }
    });
// Keep whatever follows the last separator.
if beg < input.len() { output.push(&input[beg..]); }


Note that the output of this second version is a vector of slices instead of a vector of vectors.
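
For comparison, the windows/position search from the question can simply be repeated until no match is left. The sketch below is one way to wrap that in a reusable function using only the standard library. The name split_on is an assumption of this example, it does contain an explicit loop, and it assumes a non-empty pattern:

```rust
/// Hypothetical helper: splits `input` at every non-overlapping occurrence of
/// `sep` and returns borrowed sub-slices (no elements are copied).
/// Assumes `sep` is non-empty, because `windows(0)` panics.
fn split_on<'a, T: PartialEq>(input: &'a [T], sep: &[T]) -> Vec<&'a [T]> {
    let mut parts = Vec::new();
    let mut rest = input;
    // Find the next occurrence of `sep` in what is left of the input.
    while let Some(pos) = rest.windows(sep.len()).position(|w| w == sep) {
        parts.push(&rest[..pos]);
        // Continue searching after the separator that was just found.
        rest = &rest[pos + sep.len()..];
    }
    // Whatever follows the last separator is the final part (possibly empty).
    parts.push(rest);
    parts
}
```

Called as split_on(&input, &[98, 99]), it yields the three sub-slices from the question. Unlike the fold versions, it also keeps an empty final part when the input ends with the separator, as str::split does.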
