简体   繁体   中英

How can I make my Rust function more generic & efficient?

I have a function that works, but is more specialized than I'd like and has some inefficiencies, I'd like to fix that.

The working but flawed function:

fn iter_to_min<T>(i:T) -> i64 where T:Iterator<Item=String>{
    i.collect::<Vec<String>>()
        .iter()
        .flat_map(|s|s.split_whitespace())
        .map(str::trim)
        .map(str::parse::<i64>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

The reasons I dislike this implementation are:

  • i64 is hard coded and I'd like to reuse this function for u64 and possibly other return types
  • it collects it's input just to iterate over it immediately which is inefficient (heap allocation for no reason)
  • the closure passed to flat_map might not be optimized away by LLVM in all cases

and the closest I could get to my ideal function:

use std::str::FromStr;

fn iter_to_min<T,U>(i:T) -> U where T:Iterator<Item=String>,U: Ord+FromStr{
    i.flat_map(str::split_whitespace)
        .map(str::trim)
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

The problems I am seeing are:

  • the argument passed to str::split_whitespace is a String and won't coerce to a str
  • the argument passed to str::split_whitespace isn't known to live long enough
  • Result::unwrap doesn't complains that the trait core::fmt::Debug is not implemented for the type <U as core::str::FromStr>::Err

I think that with clever lifetime notation and Trait requirements at least two of those could be fixed, and who knows maybe there's a way to go three for three.

Example code using some suggested fixes:

use std::io::BufRead;
use std::str::FromStr;
use std::fmt::Debug;

fn iter_to_min<T,U>(i:T) -> U where T:Iterator<Item=String>,U: Ord+FromStr, U::Err: Debug{
    i.collect::<Vec<String>>()
        .iter()
        .flat_map(|s|s.split_whitespace())
        .map(str::trim)
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found.")
}

fn main() {
    let a: Vec<_> = std::env::args().skip(1).collect();
    let m:i64 = if a.is_empty() {
        let s = std::io::stdin();
        let m = iter_to_min(s.lock().lines().map(Result::unwrap));
        m
    }else{
        iter_to_min(a.into_iter())
    };
    println!("{}", m);
}

Unfortunately, there is no way to do what you want when keeping both the genericity and the absence of allocations. The reason is that you need someone to own your string data, but if flat_map(str::split_whitespace) is performed over an iterator of owned strings, then there is no one to keep these owned strings anymore because str::split_whitespace only borrows the string it is called on. However, if you "push" the ownership up the call chain, you won't be able to be fully generic and accept owned strings by value.

Therefore, there are two solutions: to precollect the entire iterator to a Vec<String> (or to transform the items yielded by split_whitespace() to owned strings separately and collect them to a Vec again), or to accept an iterator of references.

Here is the most generic version I could come up with of the first solution:

use std::str::FromStr;
use std::fmt::Debug;

fn iter_to_min<S, T, U>(i: T) -> U
    where S: Into<String>,
          T: IntoIterator<Item=S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .map(Into::into)
        .collect::<Vec<_>>()
        .iter()
        .flat_map(|s| s.split_whitespace())
        .map(str::parse::<U>)
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

(try it here )

It is basically the same as your first one but more generic. Also note that you don't need to trim the parts of the string after split_whitespace() - the latter will make sure that the parts of the string do not have spaces at their sides. Into<String> bound allows one to pass both &str and String iterators, and in the latter case no extra copies will be done.

Alternatively, you can split each line into owned strings separately:

fn iter_to_min<S, T, U>(i: T) -> U
    where S: AsRef<str>,
          T: IntoIterator<Item=S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .flat_map(|s| s.as_ref().split_whitespace().map(String::from).collect::<Vec<_>>())
        .map(|s| s.parse::<U>())
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

Here we only need to obtain a &str s from iterator items, not String s, so I used AsRef<str> . However, each line must not only be converted to a sequence of String s; this sequence must be collected into a vector for exactly the same reason described above - otherwise there would be no one to keep the original values of type S from destruction.

But it is possible to avoid the .map(String::from).collect::<Vec<_>>() if you are willing to lose some genericity, though. This is the second solution I've mentioned above. We can accept an iterator over references:

fn iter_to_min<'a, S: ?Sized, T, U>(i: T) -> U
    where S: AsRef<str> + 'a,
          T: IntoIterator<Item=&'a S>,
          U: Ord + FromStr,
          U::Err: Debug
{
    i.into_iter()
        .map(AsRef::as_ref)
        .flat_map(str::split_whitespace)
        .map(|s| s.parse::<U>())
        .map(Result::unwrap)
        .min()
        .expect("No min found")
}

(try it here )

Roughly speaking, now S values are owned by someone else, and their lifetime is bigger than the scope of iter_to_min() , so you need neither to transform each part to String nor to collect the entire split result to a Vec<String> . However, you won't be able to pass a Vec<String> to this function; you will be able to pass vec.iter() , however:

let v: Vec<String> = vec!["0".into(), "1".into()];
iter_to_min(v.iter())

In all these examples I've changed Iterator to IntoIterator - this is almost always what you want to use instead of just Iterator . It allows you, for example, to pass collections to such functions directly. Second, I've added U::Err: Debug condition, which is necessary for Result::unwrap to work. And finally, to fix the problem with "String not coercing to &str` you can always use explicit closures and method syntax which would do this coercion for you.

A solution without extra allocations

use std::str::FromStr;

fn iter_to_min<T, U>(i: T) -> Option<U>
    where T: Iterator<Item = String>,
          U: Ord + FromStr
{
    i.filter_map(|string| {
            string.split_whitespace()
                .map(str::trim)
                .map(str::parse::<U>)
                .filter_map(Result::ok)
                .min()
        })
        .min()
}

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM