
How to split string into chunks in Rust to insert spaces

I'm trying to learn Rust, and I recently ran into the following problem: given a String whose length is an exact multiple of n, I want to split it into chunks of size n, insert a space between the chunks, and collect the result back into a single string.

The issue I ran into is that the chars() method returns the Chars struct, which for some reason doesn't implement the SliceConcatExt trait, so chunks() can't be called on it.

Furthermore, once I've successfully created a Chunks struct (by calling .bytes() instead), I'm unsure how to call .join(' '), since the elements are now chunks of byte slices.

There has to be an elegant way to do this that I'm missing.

For example here is an input / output that illustrates the situation:

given: whatupmyname, 4
output: what upmy name

This is my poorly written attempt:

let n = 4;
let text = "whatupmyname".into_string();
text.chars()
    // compiler error on chunks() call
    .chunks(n)
    .collect::<Vec<String>>()
    .join(' ')

The problem here is that chars() and bytes() return Iterators, not slices. You could use as_bytes(), which will give you a &[u8]. However, you cannot directly get a &[char] from a &str, because a &str stores only the UTF-8 bytes; each char has to be decoded by scanning the bytes to see how many of them make it up. You'd have to do something like this:

text.chars()
    .collect::<Vec<char>>()
    .chunks(n)
    .map(|c| c.iter().collect::<String>())
    .collect::<Vec<String>>()
    .join(" ");

However, I would NOT recommend this, as it has to allocate a lot of temporary storage for Vecs and Strings along the way. Instead, you could do something like this, which only has to allocate to create the final String:

text.chars()
    .enumerate()
    .flat_map(|(i, c)| {
        if i != 0 && i % n == 0 {
            Some(' ')
        } else {
            None
        }
        .into_iter()
        .chain(std::iter::once(c))
    })
    .collect::<String>()

This stays as iterators until the final collect: flat_map turns each character into an iterator that yields either just the character, or a space followed by the character.
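To try the iterator chain above on the example from the question, it can be wrapped in a small function. This is only a sketch: the name `insert_spaces` and its signature are our own choices rather than part of the answer, and `n` is assumed to be non-zero.

```rust
use std::iter;

/// Hypothetical helper: inserts a space after every `n` characters of `text`.
/// Assumes `n > 0`, since `i % 0` would panic.
fn insert_spaces(text: &str, n: usize) -> String {
    text.chars()
        .enumerate()
        .flat_map(|(i, c)| {
            // Yield a space before the first character of every chunk except the first.
            let sep = if i != 0 && i % n == 0 { Some(' ') } else { None };
            sep.into_iter().chain(iter::once(c))
        })
        .collect()
}

fn main() {
    println!("{}", insert_spaces("whatupmyname", 4)); // what upmy name
}
```

Because the index comes from chars().enumerate(), chunks are counted in characters rather than bytes, so non-ASCII input is handled as well.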

If you want to build a String from a sequence of chars, you can use fold for that.

Something like this:

text.chars()
    .enumerate()
    .fold(String::new(), |acc, (i, c)| {
        // Put a space before every n-th character, except at the very start.
        if i != 0 && i % n == 0 {
            format!("{} {}", acc, c)
        } else {
            format!("{}{}", acc, c)
        }
    })
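Since `format!` builds a fresh `String` on every iteration, a fold written that way copies the accumulated text over and over. One possible variant, sketched here with a hypothetical `insert_spaces_fold` helper and assuming `n > 0`, pushes onto a single accumulator instead:

```rust
/// Hypothetical helper: a fold that appends to one buffer instead of
/// building a new String per character. Assumes `n > 0`.
fn insert_spaces_fold(text: &str, n: usize) -> String {
    // Reserve room for the text plus at most one space per chunk.
    let buffer = String::with_capacity(text.len() + text.len() / n);
    text.chars().enumerate().fold(buffer, |mut acc, (i, c)| {
        if i != 0 && i % n == 0 {
            acc.push(' ');
        }
        acc.push(c);
        acc
    })
}

fn main() {
    println!("{}", insert_spaces_fold("whatupmyname", 4)); // what upmy name
}
```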

If the chunk size is fixed and every chunk boundary falls on a UTF-8 character boundary (for example, with ASCII input), you can chunk the bytes directly:

use std::str;

fn main() {
    let subs = "&#8204;&#8203;&#8204;&#8203;&#8204;&#8203;&#8203;&#8204;&#8203;&#8204;".as_bytes()
        .chunks(7)
        .map(str::from_utf8)
        .collect::<Result<Vec<&str>, _>>()
        .unwrap();
        
    println!("{:?}", subs);
}

// >> ["&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8204;", "&#8203;", "&#8203;", "&#8204;", "&#8203;", "&#8204;"]
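Applied to the string from the question, the same idea might look like the sketch below. It relies on every `n`-byte boundary falling between characters (which always holds for ASCII input); otherwise `str::from_utf8` returns an error for the chunk that was split. The function name `insert_spaces_bytes` is hypothetical.

```rust
use std::str::{self, Utf8Error};

/// Hypothetical helper: splits `text` into `n`-byte chunks and joins them with spaces.
/// Assumes `n > 0`, since `chunks(0)` panics.
fn insert_spaces_bytes(text: &str, n: usize) -> Result<String, Utf8Error> {
    let parts = text
        .as_bytes()
        .chunks(n)
        .map(str::from_utf8)
        .collect::<Result<Vec<&str>, _>>()?;
    Ok(parts.join(" "))
}

fn main() {
    println!("{:?}", insert_spaces_bytes("whatupmyname", 4)); // Ok("what upmy name")
    // "é" is two bytes in UTF-8, so a 1-byte chunk splits it and the call fails.
    println!("{}", insert_spaces_bytes("é", 1).is_err()); // true
}
```

The intermediate Vec holds only borrowed &str slices, so the text itself is copied once, in the final join.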

Such a simple task can be solved with one while loop:

fn main() {
    let n = 4;
    let text = "whatupmyname";
    let mut it = text.chars().enumerate();
    let mut rslt = String::new();

    while let Some((i, c)) = it.next() {
        // Push the space before each chunk except the first,
        // so the result has no trailing space.
        if i != 0 && i % n == 0 {
            rslt.push(' ');
        }
        rslt.push(c);
    }
    println!("{}", rslt);
}

For the sake of completeness, here is the same loop written as a for loop, as Chayim Friedman suggested in a comment:

    …
    for (i, c) in text.chars().enumerate() {
        // Push the space before each chunk except the first,
        // so the result has no trailing space.
        if i != 0 && i % n == 0 {
            rslt.push(' ');
        }
        rslt.push(c);
    }
    …
