
How to split string into chunks in Rust to insert spaces

I'm attempting to learn Rust. A recent problem I've encountered is the following: given a String whose length is an exact multiple of n, I want to split the string into chunks of size n, insert a space between these chunks, and then collect the result back into a single string.

The issue I ran into is that the chars() method returns the Chars struct, which for some reason doesn't implement the SliceConcatExt trait, so chunks() can't be called on it.

Furthermore, once I've successfully created a Chunks struct (by calling .bytes() instead), I'm unsure how to call .join(' '), since the elements are now Chunks of byte slices...

There has to be an elegant way to do this that I'm missing.

For example, here is an input/output pair that illustrates the situation:

given: whatupmyname, 4
output: what upmy name

This is my poorly written attempt:

let n = 4;
let text = "whatupmyname".into_string();
text.chars()
    // compiler error on chunks() call
    .chunks(n)
    .collect::<Vec<String>>()
    .join(' ')

The problem here is that chars() and bytes() return Iterators, not slices. You could use as_bytes(), which will give you a &[u8]. However, you cannot directly get a &[char] from a &str, because only the bytes themselves exist, and the chars must be created by scanning through them to see how many bytes make up each one. You'd have to do something like this:

text.chars()
    .collect::<Vec<char>>()
    .chunks(n)
    .map(|c| c.iter().collect::<String>())
    .collect::<Vec<String>>()
    .join(" ");

However, I would NOT recommend this, as it has to allocate a lot of temporary storage for Vecs and Strings along the way. Instead, you could do something like this, which only has to allocate to create the final String:

text.chars()
    .enumerate()
    .flat_map(|(i, c)| {
        if i != 0 && i % n == 0 {
            Some(' ')
        } else {
            None
        }
        .into_iter()
        .chain(std::iter::once(c))
    })
    .collect::<String>()

This stays as iterators until the final collect, by flat_mapping each character to an iterator that yields either just the character or a space followed by the character.
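As a minimal sketch, the flat_map approach above can be wrapped in a reusable function. The name insert_spaces and the assertion on n are assumptions added for illustration, not part of the original answer.

```rust
/// Inserts a space after every `n` characters of `text`.
/// (`insert_spaces` is a hypothetical helper name used for illustration.)
fn insert_spaces(text: &str, n: usize) -> String {
    // Assumption: a chunk size of zero is treated as a programming error.
    assert!(n > 0, "chunk size must be non-zero");
    text.chars()
        .enumerate()
        .flat_map(|(i, c)| {
            // Emit a separator before every n-th character except the first.
            let sep = if i != 0 && i % n == 0 { Some(' ') } else { None };
            sep.into_iter().chain(std::iter::once(c))
        })
        .collect()
}
```

Because it counts chars rather than bytes, this should also handle non-ASCII input; for the example in the question, insert_spaces("whatupmyname", 4) yields "what upmy name".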

So if you want to work from a list of chars to create a String, you can use fold for that.

Something like this:

text.chars()
    .enumerate()
    .fold(String::new(), |acc, (i, c)| {
        // Insert a space before every n-th character (except the first).
        if i != 0 && i % n == 0 {
            format!("{} {}", acc, c)
        } else {
            format!("{}{}", acc, c)
        }
    })
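One caveat with the fold above is that format! builds a new String on every step. A possible variant, sketched here under the assumption that mutating the accumulator is acceptable, pushes into a single String instead. The function name insert_spaces_fold and the capacity estimate are illustrative choices.

```rust
/// Fold-based variant that reuses one accumulator String.
/// (`insert_spaces_fold` is a hypothetical name used for illustration.)
fn insert_spaces_fold(text: &str, n: usize) -> String {
    assert!(n > 0, "chunk size must be non-zero");
    // Upper bound on the output size in bytes: the text itself plus at most
    // one space per `n` bytes (there are never more chars than bytes).
    let capacity = text.len() + text.len() / n;
    text.chars()
        .enumerate()
        .fold(String::with_capacity(capacity), |mut acc, (i, c)| {
            // Separator goes before each new chunk, never at the very start.
            if i != 0 && i % n == 0 {
                acc.push(' ');
            }
            acc.push(c);
            acc
        })
}
```

Passing the accumulator by value and returning it keeps the fold style while avoiding a fresh allocation per character.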

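If the size of the chunks you want to split the data into is fixed, you can chunk the bytes directly (this relies on every chunk boundary falling on a UTF-8 character boundary, as it always does for ASCII input):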

use std::str;

fn main() {
    let subs = "&#8204;&#8203;&#8204;&#8203;&#8204;&#8203;&#8203;&#8204;&#8203;&#8204;".as_bytes()
        .chunks(7)
        .map(str::from_utf8)
        .collect::<Result<Vec<&str>, _>>()
        .unwrap();
        
    println!("{:?}", subs);
}

// >> ["&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8203;", "&#8204;", "&#8203;", "&#8204;"]
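Applied to the original question, the same byte-chunking idea could look like the sketch below. The function name insert_spaces_ascii is hypothetical, and returning a Result instead of calling unwrap() is a choice made here so that input whose chunks split a multi-byte character is reported rather than causing a panic.

```rust
/// Splits `text` into `n`-byte chunks and joins them with spaces.
/// (`insert_spaces_ascii` is a hypothetical name used for illustration.)
/// Returns an error if a chunk boundary falls inside a multi-byte character.
fn insert_spaces_ascii(text: &str, n: usize) -> Result<String, std::str::Utf8Error> {
    assert!(n > 0, "chunk size must be non-zero");
    let chunks = text
        .as_bytes()
        .chunks(n)
        // Each chunk is re-validated as UTF-8; this borrows, it does not copy.
        .map(std::str::from_utf8)
        .collect::<Result<Vec<&str>, _>>()?;
    Ok(chunks.join(" "))
}
```

For the ASCII example in the question, insert_spaces_ascii("whatupmyname", 4) returns Ok("what upmy name"). This still allocates a temporary Vec of string slices, so the iterator-based answers may be preferable when that matters.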

Such a simple task can be solved with one while loop:

fn main() {
    let n = 4;
    let text = "whatupmyname";
    let mut it = text.chars().enumerate();
    let mut rslt = String::new();

    while let Some((i, c)) = it.next() {
        // Push the separator before each new chunk so no trailing space is added.
        if i != 0 && i % n == 0 {
            rslt.push(' ');
        }
        rslt.push(c);
    }
    println!("{}", rslt);
}

For the sake of completeness, here is the loop in Chayim-Friedman style (see comment):

    …
    for (i, c) in text.chars().enumerate() {
        // Push the separator before each new chunk so no trailing space is added.
        if i != 0 && i % n == 0 {
            rslt.push(' ');
        }
        rslt.push(c);
    }
    …
