
Is there a way to update a string in place in Rust?

Put another way: is it possible to URLify a string in place in Rust?

For example,


Problem statement: Replace whitespace with %20
Assumption: String will have enough capacity left to accommodate new characters.

Input: Hello how are you

Output: Hello%20how%20are%20you

I know there are ways to do this if we don't have to do this "in place". I am solving a problem that explicitly states that you have to update in place.

If there isn't any safe way to do this, is there any particular reason behind that?

[Edit]

I was able to solve this using an unsafe approach, but I would appreciate a better, more idiomatic approach if there is one.

fn space_20(sentence: &mut String) {
  if !sentence.is_ascii() {
    panic!("Invalid string");
  }

  let chars: Vec<usize> = sentence.char_indices().filter(|(_, ch)| ch.is_whitespace()).map(|(idx, _)| idx ).collect();
  let char_count = chars.len();
  if char_count == 0 {
    return;
  }

  let sentence_len = sentence.len();
  sentence.push_str(&"*".repeat(char_count*2)); // filling string with * so that bytes array becomes of required size.

  unsafe {
    let bytes = sentence.as_bytes_mut();

    let mut final_idx = sentence_len + (char_count * 2) - 1;

    let mut i = sentence_len - 1;
    let mut char_ptr = char_count - 1;
    loop {
      if i != chars[char_ptr] {
        bytes[final_idx] = bytes[i];
        if final_idx == 0 {
          // all elements are filled.
          println!("all elements are filled.");
          break;
        }
        final_idx -= 1;

      } else {
        bytes[final_idx] = '0' as u8;
        bytes[final_idx - 1] = '2' as u8;
        bytes[final_idx - 2] = '%' as u8;

        // final_idx is of type usize cannot be less than 0.
        if final_idx < 3 {
          println!("all elements are filled at start.");
          break;
        }

        final_idx -= 3;

        // char_ptr is of type usize cannot be less than 0.
        if char_ptr > 0 {
          char_ptr -= 1;
        }
      }

      if i == 0 {
        // all elements are parsed.
        println!("all elements are parsed.");
        break;
      }

      i -= 1;
    }

  }

}


fn main() {
  let mut sentence = String::with_capacity(1000);
  sentence.push_str("  hello, how are you?");
  // sentence.push_str("hello, how  are you?");
  // sentence.push_str(" hello, how are you? ");
  // sentence.push_str("  ");
  // sentence.push_str("abcd");

  space_20(&mut sentence);
  println!("{}", sentence);
}

An O(n) solution that neither uses unsafe nor allocates (provided that the string has enough capacity), using std::mem::take:

fn urlify_spaces(text: &mut String) {
    const SPACE_REPLACEMENT: &[u8] = b"%20";

    // operating on bytes for simplicity
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();

    let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
    let new_len = buffer.len() + (SPACE_REPLACEMENT.len() - 1) * space_count;
    buffer.resize(new_len, b'\0');

    let mut write_pos = new_len;

    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];

        if byte == b' ' {
            write_pos -= SPACE_REPLACEMENT.len();
            buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
                .copy_from_slice(SPACE_REPLACEMENT);
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

(playground)

It computes the final length of the string, sets up a read position and a write position, and rewrites the string from right to left. Since "%20" is longer than " ", the write position never drops below the read position, so no byte is overwritten before it has been read.
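To check the "does not allocate" claim, one option is to compare the buffer pointer before and after the call. The sketch below repeats urlify_spaces so that it compiles on its own; has_room_for_urlify is a hypothetical helper, not part of the answer, and the capacity of 64 is an arbitrary choice for the demo.

```rust
/// Same right-to-left rewrite as above, repeated so this block is self-contained.
fn urlify_spaces(text: &mut String) {
    const SPACE_REPLACEMENT: &[u8] = b"%20";

    // Take ownership of the underlying byte buffer without allocating.
    let mut buffer = std::mem::take(text).into_bytes();
    let old_len = buffer.len();

    let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
    let new_len = old_len + (SPACE_REPLACEMENT.len() - 1) * space_count;
    buffer.resize(new_len, b'\0');

    let mut write_pos = new_len;
    for read_pos in (0..old_len).rev() {
        let byte = buffer[read_pos];
        if byte == b' ' {
            write_pos -= SPACE_REPLACEMENT.len();
            buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
                .copy_from_slice(SPACE_REPLACEMENT);
        } else {
            write_pos -= 1;
            buffer[write_pos] = byte;
        }
    }

    *text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}

/// Hypothetical helper: true if the spare capacity covers the two extra
/// bytes that every space needs.
fn has_room_for_urlify(text: &String) -> bool {
    let space_count = text.bytes().filter(|&byte| byte == b' ').count();
    text.capacity() - text.len() >= 2 * space_count
}

fn main() {
    let mut sentence = String::with_capacity(64);
    sentence.push_str("Hello how are you");
    assert!(has_room_for_urlify(&sentence));

    let ptr_before = sentence.as_ptr();
    urlify_spaces(&mut sentence);

    assert_eq!(sentence, "Hello%20how%20are%20you");
    // Same pointer: the original buffer was reused, not reallocated.
    assert_eq!(sentence.as_ptr(), ptr_before);
    println!("{sentence}");
}
```

Working on bytes is also safe for non-ASCII input, because the bytes of a multi-byte UTF-8 sequence are never equal to 0x20, so only real spaces are replaced.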

Is it possible to do this without unsafe?

Yes, like this:

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
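    // Byte offset of the current character. String::remove and String::insert
    // take byte indices, so count UTF-8 bytes rather than characters.
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8();
    }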
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert(*p, '0');
        my_string.insert(*p, '2');
        my_string.insert(*p, '%');
    }
    println!("{}", my_string);
}

Here is the Playground.

But should you do it?

As discussed, for example, on Reddit, this is almost never the recommended approach, because both remove and insert are O(n) operations, as noted in the documentation. Each space therefore shifts the rest of the string several times.
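For comparison, when "in place" is not a hard requirement, the usual alternative is a single pass that builds a new string. The linked discussion is not reproduced here, so take this as an assumption about what that alternative looks like; urlify_allocating is a made-up name, and str::replace is the standard library method.

```rust
/// Hypothetical non-in-place variant: allocates a new String in one pass.
fn urlify_allocating(text: &str) -> String {
    text.replace(" ", "%20")
}

fn main() {
    let my_string = String::from("Hello how are you");
    let urlified = urlify_allocating(&my_string);
    assert_eq!(urlified, "Hello%20how%20are%20you");
    println!("{urlified}");
}
```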

Edit

A slightly better version:

fn main() {
    let mut my_string = String::from("Hello how are you");
    
    let mut insert_positions = Vec::new();
    
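    // Byte offset of the current character. String::remove and String::insert_str
    // take byte indices, so count UTF-8 bytes rather than characters.
    let mut byte_offset = 0;
    for c in my_string.chars() {
        if c == ' ' {
            insert_positions.push(byte_offset);
            byte_offset += 2; // Because we will insert two extra bytes here later.
        }
        byte_offset += c.len_utf8();
    }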
    
    for p in insert_positions.iter() {
        my_string.remove(*p);
        my_string.insert_str(*p, "%20");
    }
    println!("{}", my_string);
}

and the corresponding Playground.
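If the shifting approach is acceptable, the remove and insert_str pair can probably be folded into a single String::replace_range call, which shifts the tail once per space instead of twice. The following is only a sketch: urlify_with_replace_range is a name made up for illustration, each call is still O(n), and the offsets are still collected into a Vec first.

```rust
/// Hypothetical variant of the answer above using String::replace_range.
fn urlify_with_replace_range(text: &mut String) {
    // Byte offsets of every space, collected first so the string is not
    // borrowed while it is being mutated.
    let space_offsets: Vec<usize> = text
        .char_indices()
        .filter(|&(_, c)| c == ' ')
        .map(|(offset, _)| offset)
        .collect();

    // Go right to left so the earlier offsets stay valid after each edit.
    for offset in space_offsets.into_iter().rev() {
        // A space is one byte in UTF-8, so the range covers exactly that byte.
        text.replace_range(offset..offset + 1, "%20");
    }
}

fn main() {
    let mut my_string = String::from("Hello how are you");
    urlify_with_replace_range(&mut my_string);
    assert_eq!(my_string, "Hello%20how%20are%20you");
    println!("{my_string}");
}
```

Iterating from the end avoids the "+2 per earlier space" bookkeeping, because replacing a later space does not move anything before it.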
