How to reverse a Rust String in-place using recursion?

When reversing a String using recursion, I found it difficult to pass part of the String on to the next call, because a slice is of type &str, not String.

This doesn't compile:

fn reverse_string(s: &mut String) {
    if s.is_empty() {
        return;
    }
    // how to pass a correct parameter?
    reverse_string(&mut s[1..]);
    s.push(s.chars().nth(0).unwrap());
    s.remove(0);
}
error[E0308]: mismatched types
 --> src/lib.rs:6:20
  |
6 |     reverse_string(&mut s[1..]);
  |                    ^^^^^^^^^^^ expected struct `String`, found `str`
  |
  = note: expected mutable reference `&mut String`
             found mutable reference `&mut str`

Rust strings are UTF-8, which means that:

  1. A code point doesn't have a fixed length in bytes.
  2. There's no single definition of which unit should be swapped.
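
To make the first point concrete, here is a minimal sketch. The helper names `utf8_widths` and `byte_reversal_is_valid` are made up for this illustration. It prints the UTF-8 width of a few characters and checks whether naively reversing the raw bytes still leaves valid UTF-8:

```rust
/// Byte width of each `char` in `s` when encoded as UTF-8.
/// (Hypothetical helper, for illustration only.)
fn utf8_widths(s: &str) -> Vec<usize> {
    s.chars().map(char::len_utf8).collect()
}

/// Reverses a copy of the raw bytes and reports whether the result
/// is still valid UTF-8. (Hypothetical helper, for illustration only.)
fn byte_reversal_is_valid(s: &str) -> bool {
    let mut bytes = s.as_bytes().to_vec();
    bytes.reverse();
    String::from_utf8(bytes).is_ok()
}

fn main() {
    // 'a', 'é' (U+00E9), '猫' (U+732B), '🦀' (U+1F980)
    println!("{:?}", utf8_widths("a\u{e9}\u{732b}\u{1f980}"));
    // Pure ASCII survives a byte-wise reversal...
    println!("{}", byte_reversal_is_valid("hello"));
    // ...but a multi-byte character does not.
    println!("{}", byte_reversal_is_valid("h\u{e9}llo"));
}
```

This is why the code below restricts itself to ASCII before touching the bytes.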

If you only want to swap the characters of an ASCII string, this works:

use std::mem;

fn reverse_string_ascii(s: &mut str) {
    if !s.is_ascii() {
        return;
    }

    // Safety: We have checked that the string is ASCII,
    // so it's fine to treat it as a slice of bytes.
    unsafe {
        fn rev(b: &mut [u8]) {
            match b {
                [] => {}
                [_] => {}
                [h, rest @ .., t] => {
                    mem::swap(h, t);
                    rev(rest)
                }
            }
        }

        rev(s.as_bytes_mut());
    }
}

fn main() {
    let mut s = String::from("hello");
    reverse_string_ascii(&mut s);
    println!("{}", s);
}

There's no real reason to use recursion here though; iteration is better:

let mut todo = s.as_bytes_mut();
loop {
    match todo {
        [] => break,
        [_] => break,
        [h, rest @ .., t] => {
            mem::swap(h, t);
            todo = rest;
        }
    }
}
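
The loop above is a fragment that assumes `s` and `mem` are in scope and that it runs inside an `unsafe` block. A self-contained sketch of the same idea might look like this (the function name `reverse_string_ascii_iter` is an assumption for this example, not part of the answer):

```rust
use std::mem;

/// Reverses an ASCII string in place; leaves non-ASCII input untouched.
/// (Illustrative name, not from the original answer.)
fn reverse_string_ascii_iter(s: &mut str) {
    if !s.is_ascii() {
        return;
    }

    // Safety: the string is ASCII, so every byte is a complete character
    // and any permutation of the bytes is still valid UTF-8.
    let mut todo: &mut [u8] = unsafe { s.as_bytes_mut() };

    loop {
        match todo {
            // Zero or one byte left: nothing more to swap.
            [] | [_] => break,
            // Swap the outermost pair, then continue with the middle.
            [h, rest @ .., t] => {
                mem::swap(h, t);
                todo = rest;
            }
        }
    }
}

fn main() {
    let mut s = String::from("hello");
    reverse_string_ascii_iter(&mut s);
    println!("{}", s);
}
```

In everyday code the loop could likely be replaced by the standard slice method `reverse()` on the byte slice; the explicit loop is kept here only to mirror the fragment above.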


Slices of a String have the type &str, so your program does not compile; the compiler also says that it expected a String but found a str. You could try to convert the type, but then you might have to write your program differently.

I am not sure why you are trying to do it with recursion, but I'm sure you have a reason for that ;)

I made a working demonstration of how I would naively do it without slices, using only String and char:

fn rec_rev_str(mut s: String) -> String {
    if s.is_empty() {
        s
    } else {
        let removed_char = s.remove(0);
        let mut s = rec_rev_str(s);
        s.push(removed_char);
        s
    }
}

fn main() {
    let s = String::from("A test String :)");
    println!("{}", rec_rev_str(s));
}
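
For comparison, here is a sketch of a slice-based variant that recurses on `&str` rather than on `String`, which sidesteps the type mismatch from the question. It is not in-place (it builds a new `String`), and the names `push_reversed` and `reversed` are made up for this example:

```rust
/// Appends the characters of `s` to `out` in reverse order.
/// (Hypothetical helper, for illustration only.)
fn push_reversed(s: &str, out: &mut String) {
    let mut chars = s.chars();
    if let Some(first) = chars.next() {
        // `as_str` yields the remainder after the first char, so the
        // slice always starts on a char boundary.
        push_reversed(chars.as_str(), out);
        out.push(first);
    }
}

/// Returns a reversed copy of `s`, reversing by `char`.
fn reversed(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    push_reversed(s, &mut out);
    out
}

fn main() {
    println!("{}", reversed("A test String :)"));
}
```

Each call does a constant amount of work, whereas `String::remove(0)` has to shift the remaining bytes every time. The recursion depth still grows with the number of characters, so very long inputs could exhaust the stack.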

I would write a more efficient version than the ones suggested above. My version has O(n) complexity. It doesn't allocate if the string is ASCII, but it still needs an allocation if the string contains non-ASCII Unicode.

Other improvements are possible, though. For example, you can get rid of the allocation if all chars in the string have the same length in UTF-8 form (but don't forget about alignment when doing this).

Also, I made the function accept &mut str, which is better because it allows a wider range of inputs (e.g. reversing only a substring).

The comments show how many things need to be considered when working with Unicode in unsafe Rust.

fn reverse_str(s: &mut str){
    
    fn reverse_slice<T: Copy>(slice: &mut [T]){
        let slice_len = slice.len();
        if slice_len < 2{
            return;
        }
        slice.swap(0, slice_len-1);
        reverse_slice(&mut slice[1..slice_len-1]);
    }
    
    if s.is_ascii(){
        // Simple case: can reverse inplace
        unsafe{
            // Safety: string is ASCII
            reverse_slice(s.as_bytes_mut());
        }
    }
    else{
        // complex case: we need to work with unicode
        // Need to allocate, unfortunately
        let mut chars: Vec<char> = s.chars().collect();
        reverse_slice(&mut chars);
        unsafe {
            // Safety: We write same chars -> we have same length
            // Safety: We write correct UTF8 symbol by symbol
            // Safety: There are not possible panics in this unsafe block
            let mut bytes = s.as_bytes_mut();
            for c in chars{
                let bytes_written = c.encode_utf8(bytes).len();
                bytes = &mut bytes[bytes_written..]
            }
        }
    }
}

fn main(){
    // ASCII
    let mut s = "Hello".to_string();
    reverse_str(&mut s);
    println!("{}", s);
    
    // Unicode
    let mut s = "Авокадо 126".to_string();
    reverse_str(&mut s);
    println!("{}", s);
    
    // Substring
    let mut s = "Hello world".to_string();
    reverse_str(&mut s[6..]);
    println!("{}", s);
}

Output:

olleH
621 одаковА
Hello dlrow

Also, LLVM successfully tail-optimizes this recursion: https://rust.godbolt.org/z/95sqfM
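
On the remark about avoiding the allocation: one possible approach, sketched here as an assumption and not as part of the answer above, is to reverse all the bytes and then re-reverse the bytes of each character. It is iterative rather than recursive, and it takes a `&mut String` so that it can stay in safe code. The name `reverse_utf8_in_place` is made up for this example:

```rust
/// Reverses a string by `char` without allocating a second buffer.
/// (Illustrative sketch; name and approach are not from the answer above.)
fn reverse_utf8_in_place(s: &mut String) {
    // Take ownership of the buffer; `s` is left as an empty String,
    // which does not allocate.
    let mut bytes = std::mem::take(s).into_bytes();

    // Step 1: reverse every byte. Characters are now in the right order,
    // but the bytes inside each multi-byte character are backwards
    // (continuation bytes first, leading byte last).
    bytes.reverse();

    // Step 2: restore each character by reversing its own bytes.
    let mut start = 0;
    for i in 0..bytes.len() {
        // Continuation bytes look like 0b10xx_xxxx; anything else is an
        // ASCII byte or a leading byte, which now ends a character.
        if (bytes[i] & 0xC0) != 0x80 {
            bytes[start..=i].reverse();
            start = i + 1;
        }
    }

    // Validates and reuses the same buffer.
    *s = String::from_utf8(bytes).expect("reversal preserves UTF-8 validity");
}

fn main() {
    let mut s = String::from("Авокадо 126");
    reverse_utf8_in_place(&mut s);
    println!("{}", s);
}
```

Doing the same on a `&mut str` would need `unsafe`, because the buffer briefly holds invalid UTF-8 between the two steps.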
