Is there a way to update a string in Rust?

Question

You can also think of this as: is it possible to URLify a string in Rust?

For example,


Problem statement: Replace whitespace with %20
Assumption: String will have enough capacity left to accommodate new characters.

Input: Hello how are you

Output: Hello%20how%20are%20you

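I know there are ways to do this if we don't have to do it "in place". I am solving a problem that explicitly states that you have to update the string in place.

If there is no safe way to do this, is there any particular reason behind that?

[Edit]

I was able to solve this using an unsafe approach, but I would appreciate a better approach than this, and a more idiomatic one if it exists.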

fn space_20(sentence: &mut String) {
  if !sentence.is_ascii() {
    panic!("Invalid string");
  }

  let chars: Vec<usize> = sentence.char_indices().filter(|(_, ch)| ch.is_whitespace()).map(|(idx, _)| idx ).collect();
  let char_count = chars.len();
  if char_count == 0 {
    return;
  }

  let sentence_len = sentence.len();
  sentence.push_str(&"*".repeat(char_count*2)); // filling string with * so that bytes array becomes of required size.

  unsafe {
    let bytes = sentence.as_bytes_mut();

    let mut final_idx = sentence_len + (char_count * 2) - 1;

    let mut i = sentence_len - 1;
    let mut char_ptr = char_count - 1;
    loop {
      if i != chars[char_ptr] {
        bytes[final_idx] = bytes[i];
        if final_idx == 0 {
          // all elements are filled.
          println!("all elements are filled.");
          break;
        }
        final_idx -= 1;

      } else {
        bytes[final_idx] = '0' as u8;
        bytes[final_idx - 1] = '2' as u8;
        bytes[final_idx - 2] = '%' as u8;

        // final_idx is of type usize cannot be less than 0.
        if final_idx < 3 {
          println!("all elements are filled at start.");
          break;
        }

        final_idx -= 3;

        // char_ptr is of type usize cannot be less than 0.
        if char_ptr > 0 {
          char_ptr -= 1;
        }
      }

      if i == 0 {
        // all elements are parsed.
        println!("all elements are parsed.");
        break;
      }

      i -= 1;
    }

  }

}


fn main() {
  let mut sentence = String::with_capacity(1000);
  sentence.push_str("  hello, how are you?");
  // sentence.push_str("hello, how  are you?");
  // sentence.push_str(" hello, how are you? ");
  // sentence.push_str("  ");
  // sentence.push_str("abcd");

  space_20(&mut sentence);
  println!("{}", sentence);
}

Answer 1

An O(n) solution using std::mem::take that neither uses unsafe nor allocates (provided the string has enough capacity):

fn urlify_spaces(text: &mut String) {
    const SPACE_REPLACEMENT: &[u8] = b"%20";

    // operating on bytes for simplicity
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();

    let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
    let new_len = buffer.len() + (SPACE_REPLACEMENT.len() - 1) * space_count;
    buffer.resize(new_len, b'\0');

    let mut write_pos = new_len;

    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];

        if byte == b' ' {
            write_pos -= SPACE_REPLACEMENT.len();
            buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
                .copy_from_slice(SPACE_REPLACEMENT);
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

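(Playground)

Basically, it computes the final length of the string, sets up a read pointer and a write pointer, and translates the string from right to left. Because "%20" has more characters than " ", the write pointer never catches up with the read pointer, so no byte is overwritten before it has been read.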
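To make the "does not allocate" claim concrete, here is a minimal sketch that repeats the function in condensed form and checks that the heap pointer and capacity are unchanged afterwards. The capacity of 64 and the test sentence are assumptions chosen for illustration; the check relies on the string having enough spare capacity, as the question stipulates.

```rust
/// Same right-to-left rewrite as in the answer, condensed for a quick check.
fn urlify_spaces(text: &mut String) {
    const SPACE_REPLACEMENT: &[u8] = b"%20";

    // Take ownership of the bytes; `text` is left as an empty, unallocated String.
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();
    let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
    let new_len = old_len + (SPACE_REPLACEMENT.len() - 1) * space_count;
    // Grows the length only; no reallocation while new_len <= capacity.
    buffer.resize(new_len, 0);

    let mut write_pos = new_len;
    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];
        if byte == b' ' {
            write_pos -= SPACE_REPLACEMENT.len();
            buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
                .copy_from_slice(SPACE_REPLACEMENT);
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

fn main() {
    // Assumption for this sketch: 64 bytes is more than the result needs.
    let mut sentence = String::with_capacity(64);
    sentence.push_str("Hello how are you");
    let ptr_before = sentence.as_ptr();
    let cap_before = sentence.capacity();

    urlify_spaces(&mut sentence);

    assert_eq!(sentence, "Hello%20how%20are%20you");
    // Same buffer, same capacity: the rewrite happened in the original allocation.
    assert_eq!(sentence.as_ptr(), ptr_before);
    assert_eq!(sentence.capacity(), cap_before);
    println!("{}", sentence);
}
```

If the string does not have enough spare capacity, resize will reallocate and the pointer check no longer holds, although the result is still correct.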

Answer 2

Is it possible to do this without unsafe?

Yes, like this:

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8(); // remove/insert take byte offsets, not char counts.
    }
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert(*p, '0');
        my_string.insert(*p, '2');
        my_string.insert(*p, '%');
    }
    println!("{}", my_string);
}

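Here is the playground.

But should you do it this way?

As discussed in this example on Reddit, this is almost never the recommended way to do it, because remove and insert are both O(n) operations, as stated in the documentation.

Edit

A slightly better version: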

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8(); // remove/insert_str take byte offsets, not char counts.
    }
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert_str(*p, "%20");
    }
    println!("{}", my_string);
}

And the corresponding Playground.
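As a further variation on the same idea, the remove followed by insert_str can be folded into a single String::replace_range call. The sketch below is not from the answer: the function name urlify_with_replace_range is hypothetical, and it collects byte offsets with match_indices and walks them from right to left so earlier offsets stay valid. Each replacement still shifts the tail of the string, so the O(n) cost per space noted above still applies.

```rust
/// Hypothetical helper (not from the answer): replaces each space with "%20"
/// using byte offsets, so non-ASCII text is handled as well.
fn urlify_with_replace_range(text: &mut String) {
    // Byte offsets of every space in the original string.
    let space_offsets: Vec<usize> = text.match_indices(' ').map(|(idx, _)| idx).collect();

    // Walk right to left so the remaining offsets are unaffected by each replacement.
    for &offset in space_offsets.iter().rev() {
        // A space is exactly one byte, so offset..offset + 1 covers it.
        text.replace_range(offset..offset + 1, "%20");
    }
}

fn main() {
    let mut my_string = String::from("Hello how are you");
    urlify_with_replace_range(&mut my_string);
    assert_eq!(my_string, "Hello%20how%20are%20you");
    println!("{}", my_string);
}
```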
