[英]Is there a way to update a string in place in rust?
您也可以将其视为,是否可以在 rust 中对字符串进行 URL 化?
例如,
Problem statement: Replace whitespace with %20
Assumption: String will have enough capacity left to accommodate new characters.
Input: Hello how are you
Output: Hello%20how%20are%20you
我知道如果我们不必“就地”这样做,有办法做到这一点。 我正在解决一个明确指出您必须就地更新的问题。
如果没有任何安全的方法可以做到这一点,背后是否有任何特殊原因?
[编辑]
我能够使用unsafe
方法解决这个问题,但希望有比这更好的方法。 如果有的话,更惯用的方法。
fn space_20(sentence: &mut String) {
if !sentence.is_ascii() {
panic!("Invalid string");
}
let chars: Vec<usize> = sentence.char_indices().filter(|(_, ch)| ch.is_whitespace()).map(|(idx, _)| idx ).collect();
let char_count = chars.len();
if char_count == 0 {
return;
}
let sentence_len = sentence.len();
sentence.push_str(&"*".repeat(char_count*2)); // filling string with * so that bytes array becomes of required size.
unsafe {
let bytes = sentence.as_bytes_mut();
let mut final_idx = sentence_len + (char_count * 2) - 1;
let mut i = sentence_len - 1;
let mut char_ptr = char_count - 1;
loop {
if i != chars[char_ptr] {
bytes[final_idx] = bytes[i];
if final_idx == 0 {
// all elements are filled.
println!("all elements are filled.");
break;
}
final_idx -= 1;
} else {
bytes[final_idx] = '0' as u8;
bytes[final_idx - 1] = '2' as u8;
bytes[final_idx - 2] = '%' as u8;
// final_idx is of type usize cannot be less than 0.
if final_idx < 3 {
println!("all elements are filled at start.");
break;
}
final_idx -= 3;
// char_ptr is of type usize cannot be less than 0.
if char_ptr > 0 {
char_ptr -= 1;
}
}
if i == 0 {
// all elements are parsed.
println!("all elements are parsed.");
break;
}
i -= 1;
}
}
}
fn main() {
let mut sentence = String::with_capacity(1000);
sentence.push_str(" hello, how are you?");
// sentence.push_str("hello, how are you?");
// sentence.push_str(" hello, how are you? ");
// sentence.push_str(" ");
// sentence.push_str("abcd");
space_20(&mut sentence);
println!("{}", sentence);
}
使用std::mem::take
的O ( n ) 解决方案,既不使用unsafe
也不分配(前提是字符串具有足够的容量):
fn urlify_spaces(text: &mut String) {
const SPACE_REPLACEMENT: &[u8] = b"%20";
// operating on bytes for simplicity
let mut buffer = std::mem::take(text).into_bytes();
let old_len = buffer.len();
let space_count = buffer.iter().filter(|&&byte| byte == b' ').count();
let new_len = buffer.len() + (SPACE_REPLACEMENT.len() - 1) * space_count;
buffer.resize(new_len, b'\0');
let mut write_pos = new_len;
for read_pos in (0..old_len).rev() {
let byte = buffer[read_pos];
if byte == b' ' {
write_pos -= SPACE_REPLACEMENT.len();
buffer[write_pos..write_pos + SPACE_REPLACEMENT.len()]
.copy_from_slice(SPACE_REPLACEMENT);
} else {
write_pos -= 1;
buffer[write_pos] = byte;
}
}
*text = String::from_utf8(buffer).expect("invalid UTF-8 during URL-ification");
}
( 操场)
基本上,它计算字符串的最终长度,设置一个读指针和一个写指针,并将字符串从右向左翻译。 由于"%20"
字符多于" "
,所以写指针永远赶不上读指针。
是否有可能在没有unsafe
情况下做到这一点?
是这样的:
fn main() {
let mut my_string = String::from("Hello how are you");
let mut insert_positions = Vec::new();
let mut char_counter = 0;
for c in my_string.chars() {
if c == ' ' {
insert_positions.push(char_counter);
char_counter += 2; // Because we will insert two extra chars here later.
}
char_counter += 1;
}
for p in insert_positions.iter() {
my_string.remove(*p);
my_string.insert(*p, '0');
my_string.insert(*p, '2');
my_string.insert(*p, '%');
}
println!("{}", my_string);
}
这里是游乐场。
但是你应该这样做吗?
正如Reddit 上的示例所讨论的,这几乎总是不推荐的这样做的方法,因为remove
和insert
都是O(n)
操作,如文档中所述。
稍微好一点的版本:
fn main() {
let mut my_string = String::from("Hello how are you");
let mut insert_positions = Vec::new();
let mut char_counter = 0;
for c in my_string.chars() {
if c == ' ' {
insert_positions.push(char_counter);
char_counter += 2; // Because we will insert two extra chars here later.
}
char_counter += 1;
}
for p in insert_positions.iter() {
my_string.remove(*p);
my_string.insert_str(*p, "%20");
}
println!("{}", my_string);
}
和相应的Playground 。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.