
How to pattern match a transformed Vec<String> in Rust idiomatically?

Given a variable tokens: &Vec<String>, I would like to do the following:

  1. Validate it: a valid tokens must start with "abc", "def" and end with "xyz" (loosely speaking, since the elements are String rather than &str), and the comparison is case-insensitive.
  2. If tokens is valid, extract the remaining tokens for later processing.

What I tried:

fn process_tokens(tokens: &Vec<String>) -> Result<(), &str> {
    let lowercased_tokens: Vec<String> = tokens.iter().map(|s| s.to_lowercase()).collect();
    match lowercased_tokens.iter().map(|s| s as &str).collect::<Vec<_>>().as_slice() {
        ["abc", "def", remaining_tokens @ .., "xyz"] => { 
            // do something with remaining_tokens
            Ok(())
        }
        _ => Err("Invalid tokens!"),
    }
}

My problem with this:

  1. A temporary vector lowercased_tokens is created, which is not ideal for performance.
  2. It is verbose.

However, I have difficulty finding a way around these:

  1. Without the temporary vector, the lowercased strings produced in the method chain are temporaries, so we cannot apply |s| s as &str to them.
  2. In the other direction, I don't know how to pattern match a Vec<String> directly, i.e., putting "abc".to_string() instead of "abc" in the pattern (which is invalid syntax).

My suggestion would be to work with a slice of strings (&[String]) [1], as you can "peel off" the elements you are validating. We can wrap this up into two functions: one that expects a given token at the start and one that expects a given token at the end. The comparisons are done case-insensitively in place with eq_ignore_ascii_case (which folds ASCII letters only), so no lowercased copies of the strings need to be allocated. Each function returns a subslice with the expected element removed.

Doing it this way requires no additional allocations, as the slice is just a fat pointer into the vector's own allocation.

fn expect_start<'a>(expected: &'_ str, strs: &'a [String])
    -> Result<&'a [String], &'static str>
{
    match strs.first() {
        Some(v) if expected.eq_ignore_ascii_case(v) => Ok(&strs[1..]),
        _ => Err("unexpected token at start of string"),
    }
}

fn expect_end<'a>(expected: &'_ str, strs: &'a [String])
    -> Result<&'a [String], &'static str>
{
    match strs.last() {
        Some(v) if expected.eq_ignore_ascii_case(v) => Ok(&strs[..(strs.len() - 1)]),
        _ => Err("unexpected token at end of string"),
    }
}
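One caveat about the comparison used in these helpers: eq_ignore_ascii_case folds only the ASCII letters, whereas the to_lowercase call in the question applies Unicode case mapping. For ASCII keywords such as "abc" the two agree. A minimal sketch of the difference, where ascii_eq and unicode_eq are hypothetical helper names used only for illustration:

```rust
/// ASCII-only comparison without allocation (the approach the helpers use).
fn ascii_eq(a: &str, b: &str) -> bool {
    a.eq_ignore_ascii_case(b)
}

/// Lowercase-based comparison using Unicode case mapping.
/// Allocates two temporary Strings per call.
fn unicode_eq(a: &str, b: &str) -> bool {
    a.to_lowercase() == b.to_lowercase()
}
```

If the expected tokens may contain non-ASCII letters, the allocation-free version would treat differently cased forms as unequal, so the choice depends on the input you expect.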

Now we can easily combine these into the process_tokens function, bailing whenever we encounter an error (using ?):

fn process_tokens(mut tok: &[String]) -> Result<(), &'static str> {
    tok = expect_start("abc", tok)?;
    tok = expect_start("def", tok)?;
    tok = expect_end("xyz", tok)?;
    
    // Do something with tok
    println!("Remaining tokens: {:?}", tok);
    Ok(())
}
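For comparison, the single-match shape from the question can also be kept without any temporary vector: bind the first two elements and the last one in a slice pattern, then do the case-insensitive check in a match guard. This is only a sketch and not part of the original answer. The name process_tokens_pattern and the choice to return the remaining slice are assumptions made for illustration.

```rust
// Hypothetical alternative: one slice pattern plus a guard, no allocation.
fn process_tokens_pattern(tokens: &[String]) -> Result<&[String], &'static str> {
    match tokens {
        // The pattern requires at least three elements; `rest` may be empty.
        [first, second, rest @ .., last]
            if first.eq_ignore_ascii_case("abc")
                && second.eq_ignore_ascii_case("def")
                && last.eq_ignore_ascii_case("xyz") =>
        {
            // `rest` borrows from `tokens`, so nothing is copied.
            Ok(rest)
        }
        _ => Err("Invalid tokens!"),
    }
}
```

The guard replaces the string literals in the pattern, which is what makes matching on String elements possible. The peel-off helpers may still be preferable when you want a distinct error for each position.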



[1] These functions could work with &str just as easily as String, but calling them with a slice of &str would require first building a second vector of &str that references the strings in the original vector. Alternatively, you can make the functions accept slices of either String or &str by taking &[T] where T: Borrow<str>, as both String and &str implement this trait.
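A minimal sketch of that generic variant follows. The names expect_start_generic and expect_end_generic are hypothetical, and split_first / split_last are used here in place of indexing, which is a stylistic choice rather than something the answer prescribes.

```rust
use std::borrow::Borrow;

// Accepts &[String], &[&str], or any slice whose elements borrow as str.
fn expect_start_generic<'a, T: Borrow<str>>(expected: &str, strs: &'a [T])
    -> Result<&'a [T], &'static str>
{
    match strs.split_first() {
        // `first` is &T; borrow it as &str for the comparison.
        Some((first, rest))
            if expected.eq_ignore_ascii_case(<T as Borrow<str>>::borrow(first)) => Ok(rest),
        _ => Err("unexpected token at start of string"),
    }
}

fn expect_end_generic<'a, T: Borrow<str>>(expected: &str, strs: &'a [T])
    -> Result<&'a [T], &'static str>
{
    match strs.split_last() {
        // split_last yields (last element, everything before it).
        Some((last, rest))
            if expected.eq_ignore_ascii_case(<T as Borrow<str>>::borrow(last)) => Ok(rest),
        _ => Err("unexpected token at end of string"),
    }
}
```

With these signatures, a Vec<String> and a Vec<&str> can both be passed by reference without building an intermediate vector.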


 