Is there a way to update a string in place in Rust?

Question

You can also think of this as: is it possible to URLify a string in place in Rust?

For example,


Problem statement: Replace whitespace with %20
Assumption: String will have enough capacity left to accommodate new characters.

Input: Hello how are you

Output: Hello%20how%20are%20you

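I know there are ways to do this if we don't have to do it "in place". I am solving a problem that explicitly states the string must be updated in place.

If there is no safe way to do this, is there a particular reason behind that?

[Edit]

I was able to solve this using unsafe, but I would appreciate a better approach than this, ideally a more idiomatic one.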

fn space_20(sentence: &mut String) {
  if !sentence.is_ascii() {
    panic!("Invalid string");
  }

  let chars: Vec<usize> = sentence.char_indices().filter(|(_, ch)| ch.is_whitespace()).map(|(idx, _)| idx ).collect();
  let char_count = chars.len();
  if char_count == 0 {
    return;
  }

  let sentence_len = sentence.len();
  sentence.push_str(&"*".repeat(char_count*2)); // filling string with * so that bytes array becomes of required size.

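  // SAFETY: the string was checked to be ASCII above, and only ASCII bytes
  // are written below, so the buffer remains valid UTF-8.
  unsafe {
    let bytes = sentence.as_bytes_mut();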

    let mut final_idx = sentence_len + (char_count * 2) - 1;

    let mut i = sentence_len - 1;
    let mut char_ptr = char_count - 1;
    loop {
      if i != chars[char_ptr] {
        bytes[final_idx] = bytes[i];
        if final_idx == 0 {
          // all elements are filled.
          println!("all elements are filled.");
          break;
        }
        final_idx -= 1;

      } else {
        bytes[final_idx] = '0' as u8;
        bytes[final_idx - 1] = '2' as u8;
        bytes[final_idx - 2] = '%' as u8;

        // final_idx is of type usize cannot be less than 0.
        if final_idx < 3 {
          println!("all elements are filled at start.");
          break;
        }

        final_idx -= 3;

        // char_ptr is of type usize cannot be less than 0.
        if char_ptr > 0 {
          char_ptr -= 1;
        }
      }

      if i == 0 {
        // all elements are parsed.
        println!("all elements are parsed.");
        break;
      }

      i -= 1;
    }

  }

}


fn main() {
  let mut sentence = String::with_capacity(1000);
  sentence.push_str("  hello, how are you?");
  // sentence.push_str("hello, how  are you?");
  // sentence.push_str(" hello, how are you? ");
  // sentence.push_str("  ");
  // sentence.push_str("abcd");

  space_20(&mut sentence);
  println!("{}", sentence);
}

Answer 1

An O(n) solution using std::mem::take that neither uses unsafe nor allocates (provided the string has enough capacity):

fn urlify_spaces(text: &mut String) {
    const SPACE_REPLACEMENT: &[u8] = b"%20";

    // operating on bytes for simplicity
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();

    let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
    let new_len = buffer.len() + (SPACE_REPLACEMENT.len() - 1) * space_count;
    buffer.resize(new_len, b'\0');

    let mut write_pos = new_len;

    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];

        if byte == b' ' {
            write_pos -= SPACE_REPLACEMENT.len();
            buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
                .copy_from_slice(SPACE_REPLACEMENT);
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

(Playground)

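Basically, it computes the final length of the string, sets up a read pointer and a write pointer, and translates the string from right to left. Because "%20" is three bytes and " " is only one, the write pointer never drops below the read pointer, so no byte is overwritten before it has been read.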
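To make the pointer argument concrete, here is a minimal sketch of the same right-to-left pass with the invariant written out as an assertion. The names `urlify_checked` and `urlify_and_check_reuse` are hypothetical and not part of the answer above; the second helper only compares the heap pointer before and after the call to show that, assuming the spare capacity is large enough, the original buffer is reused.

```rust
/// Hypothetical variant of the answer's function with the invariant spelled out.
fn urlify_checked(text: &mut String) {
    // Take ownership of the bytes; `text` is left as an empty String for now.
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();

    // Each space grows from 1 byte to 3, i.e. 2 extra bytes per space.
    let space_count = buffer.iter().filter(|&&b| b == b' ').count();
    let new_len = old_len + 2 * space_count;
    buffer.resize(new_len, 0);

    let mut write_pos = new_len;
    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];
        if byte == b' ' {
            write_pos -= 3;
            buffer[write_pos..write_pos + 3].copy_from_slice(b"%20");
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
        // The write position never falls below the byte that was just read,
        // so every unread byte (indices < read_pos) is still intact.
        debug_assert!(write_pos >= read_pos);
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

/// Hypothetical helper: runs the function on a String with the given capacity
/// and reports whether the heap buffer stayed at the same address.
fn urlify_and_check_reuse(input: &str, capacity: usize) -> (String, bool) {
    let mut text = String::with_capacity(capacity);
    text.push_str(input);
    let before = text.as_ptr();
    urlify_checked(&mut text);
    let reused = text.as_ptr() == before;
    (text, reused)
}
```

Calling `urlify_and_check_reuse("Hello how are you", 64)` should give back `"Hello%20how%20are%20you"` together with `true`, since 64 bytes of capacity are more than the 23 bytes the result needs.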

Answer 2

Is it possible to do this without unsafe?

Yes, like this:

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
    // String::remove and String::insert take byte offsets, so track bytes, not chars.
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8();
    }
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert(*p, '0');
        my_string.insert(*p, '2');
        my_string.insert(*p, '%');
    }
    println!("{}", my_string);
}

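(Playground)

But should you do it this way?

As discussed, for example, on Reddit, this is almost never the recommended way to do it, because remove and insert are both O(n) operations, as noted in the documentation. Each space therefore shifts the rest of the string, and the whole loop is quadratic in the worst case.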
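For comparison, when the in-place requirement does not apply, the usual approach is to build a new string with `str::replace`. The sketch below is only that baseline, not an in-place solution: it runs in a single pass but allocates a fresh `String`. The function name `urlify_copy` is hypothetical.

```rust
/// Hypothetical helper: returns a new String with every space replaced by "%20".
/// This allocates, so it does not satisfy the "in place" constraint of the question.
fn urlify_copy(text: &str) -> String {
    text.replace(' ', "%20")
}
```

An existing `String` can be overwritten with the result via `*s = urlify_copy(s);`, at the cost of one extra allocation.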

Edit

A slightly better version:

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
    // String::remove and String::insert_str take byte offsets, so track bytes, not chars.
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8();
    }
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert_str(*p, "%20");
    }
    println!("{}", my_string);
}

And the corresponding Playground.

Answer 1: 2 (accepted), 2021-06-28 01:39:21

Answer 2: 0, 2021-06-27 17:09:33
