Rust lifetimes, data flows into other references

I wrote the following code to filter a stream of data. It worked fine until I changed from parsing simple numbers to also handling types that are bound to lifetimes, like &str and &[u8].

use wirefilter::{ExecutionContext, Filter, Scheme};

lazy_static::lazy_static! {
    static ref SCHEME: Scheme = Scheme! {
        port: Int,
        name: Bytes,
    };
}

#[derive(Debug)]
struct MyStruct {
    port: i32,
    name: String,
}

impl MyStruct {
    fn scheme() -> &'static Scheme {
        &SCHEME
    }

    fn filter_matches<'s>(&self, filter: &Filter<'s>) -> bool {
        let mut ctx = ExecutionContext::new(Self::scheme());
        ctx.set_field_value("port", self.port).unwrap();
        ctx.set_field_value("name", self.name.as_str()).unwrap();

        filter.execute(&ctx).unwrap()
    }
}

fn main() -> Result<(), failure::Error> {
    let data = expensive_data_iterator();
    let scheme = MyStruct::scheme();
    let filter = scheme
        .parse("port in {2 5} && name matches \"http.*\"")?
        .compile();

    for my_struct in data
        .filter(|my_struct| my_struct.filter_matches(&filter))
        .take(2)
    {
        println!("{:?}", my_struct);
    }

    Ok(())
}

fn expensive_data_iterator() -> impl Iterator<Item = MyStruct> {
    (0..).map(|port| MyStruct {
        port,
        name: format!("http {}", port % 2),
    })
}

If I try to compile it, the compiler fails with this:

error[E0623]: lifetime mismatch
  --> src/main.rs:26:16
   |
21 |     fn filter_matches<'s>(&self, filter: &Filter<'s>) -> bool {
   |                           -----           ----------
   |                           |
   |                           these two types are declared with different lifetimes...
...
26 |         filter.execute(&ctx).unwrap()
   |                ^^^^^^^ ...but data from `self` flows into `filter` here

error: aborting due to previous error

error: Could not compile `wirefilter_playground`.

To learn more, run the command again with --verbose.

Process finished with exit code 101

My first thought was that self and filter should have the same lifetime in fn filter_matches<'s>(&self, filter: &Filter<'s>) -> bool, but if I change the signature to fn filter_matches<'s>(&'s self, filter: &Filter<'s>) -> bool, I start getting this error:

error: borrowed data cannot be stored outside of its closure
  --> src/main.rs:38:29
   |
33 |     let filter = scheme
   |         ------ ...so that variable is valid at time of its declaration
...
38 |         .filter(|my_struct| my_struct.filter_matches(&filter))
   |                 ----------- ^^^^^^^^^ -------------- cannot infer an appropriate lifetime...
   |                 |           |
   |                 |           cannot be stored outside of its closure
   |                 borrowed data cannot outlive this closure

error: aborting due to previous error

error: Could not compile `wirefilter_playground`.

To learn more, run the command again with --verbose.

Process finished with exit code 101

I fail to understand the reason. Filter<'s> is bound to SCHEME, which is lazily generated and has the 'static lifetime, so it makes sense that filter.execute is not allowed to keep a reference to self.name.as_str(), because the filter would outlive it. But filter.execute(&ctx), whose signature is pub fn execute(&self, ctx: &ExecutionContext<'s>) -> Result<bool, SchemeMismatchError>, returns a result that carries no lifetime. Isn't it supposed to drop the references as soon as it finishes?

To try compiling the code above, you can use this Cargo.toml:

[package]
name = "wirefilter_playground"
version = "0.1.0"
edition = "2018"

[dependencies]
wirefilter-engine = "0.6.1"
failure = "0.1.5"
lazy_static = "1.3.0"

PS: This could be solved by compiling the AST inside the filter_matches method, but that would be somewhat bad: the user would only get the parse error when trying to filter, and it could potentially be slower.

I see two ways to solve this problem:
1) Extend the lifetime of self.name. This can be achieved by collecting expensive_data_iterator into, say, a Vec (which requires the data to be finite and to fit in memory).

--- let data = expensive_data_iterator();
+++ let data: Vec<_> = expensive_data_iterator().collect();

2) Reduce the lifetime of filter.

--- let filter = scheme.parse("...")?.compile();
+++ let filter = scheme.parse("...")?;

--- .filter(|my_struct| my_struct.filter_matches(&filter))
+++ .filter(|my_struct| my_struct.filter_matches(&filter.clone().compile()))

I omitted some other minor changes. And yes, filter_matches<'s>(&'s self, ...) is mandatory in either case.
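
To make the two options concrete without pulling in wirefilter, here is a minimal sketch built on hypothetical stand-in types. Ctx, Filter and FilterAst below are assumptions that only mimic the lifetime shape of ExecutionContext, Filter and the parsed AST; they are not the library's API. The take(10) is also an assumption, added so that collecting terminates. The boxed closure is what makes the stand-in Filter<'s> invariant in 's, which is why the borrowed struct has to live at least as long as 's.

```rust
// Hypothetical stand-ins for wirefilter's types; NOT the real API.
// Only the lifetime relationships are modeled.

/// Field values handed to a filter; borrows the name for 'e.
struct Ctx<'e> {
    port: i32,
    name: &'e str,
}

/// The boxed closure over Ctx<'s> makes Filter invariant in 's, so every
/// context it executes against must borrow data that lives at least 's.
struct Filter<'s> {
    check: Box<dyn Fn(&Ctx<'s>) -> bool + 's>,
}

impl<'s> Filter<'s> {
    fn execute(&self, ctx: &Ctx<'s>) -> bool {
        (self.check)(ctx)
    }
}

/// Parsed but not yet compiled filter: port is 2 or 5, name starts with `prefix`.
#[derive(Clone)]
struct FilterAst<'s> {
    prefix: &'s str,
}

impl<'s> FilterAst<'s> {
    fn compile(self) -> Filter<'s> {
        let prefix = self.prefix;
        Filter {
            check: Box::new(move |ctx: &Ctx<'s>| {
                (ctx.port == 2 || ctx.port == 5) && ctx.name.starts_with(prefix)
            }),
        }
    }
}

#[derive(Debug)]
struct MyStruct {
    port: i32,
    name: String,
}

impl MyStruct {
    // `&'s self` ties the borrow of `self.name` to the filter's lifetime.
    fn filter_matches<'s>(&'s self, filter: &Filter<'s>) -> bool {
        let ctx = Ctx { port: self.port, name: self.name.as_str() };
        filter.execute(&ctx)
    }
}

fn expensive_data_iterator() -> impl Iterator<Item = MyStruct> {
    (0..).map(|port| MyStruct { port, name: format!("http {}", port % 2) })
}

/// Option 1: collect first, so the data outlives the filter.
fn with_collected_data() -> Vec<i32> {
    let data: Vec<MyStruct> = expensive_data_iterator().take(10).collect();
    let filter = FilterAst { prefix: "http" }.compile();
    let ports: Vec<i32> = data
        .iter() // iterate by reference; the borrows last as long as `data`
        .filter(|my_struct| my_struct.filter_matches(&filter))
        .take(2)
        .map(|my_struct| my_struct.port)
        .collect();
    ports
}

/// Option 2: compile inside the closure, so the filter is shorter-lived
/// than the item it is applied to.
fn with_compile_per_item() -> Vec<i32> {
    let ast = FilterAst { prefix: "http" };
    let ports: Vec<i32> = expensive_data_iterator()
        .filter(|my_struct| my_struct.filter_matches(&ast.clone().compile()))
        .take(2)
        .map(|my_struct| my_struct.port)
        .collect();
    ports
}

fn main() {
    println!("{:?}", with_collected_data());
    println!("{:?}", with_compile_per_item());
}
```

In this model both functions select the same two items; the difference is only where the filter is created relative to the data it borrows from.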

PS: Yes, the second option works because my_struct outlives filter. If both approaches are somewhat bad, you can combine them: process the data in chunks, collecting each one into a vector.

const N: usize = 10; // or any other size
loop {
    let cur_chunk: Vec<_> = data.by_ref().take(N).collect();
    if cur_chunk.is_empty() {
        break;
    }
    let cur_filter = filter.clone().compile();
    // etc
}

This uses only O(N) memory and compiles the filter once per chunk of N items instead of once per item.
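
The chunked loop can be filled in roughly as follows. This is a sketch under the same assumption of hypothetical stand-in types (Ctx, Filter, FilterAst) that model only the lifetimes, not wirefilter's real API; the function name first_matches_chunked and its limit parameter are likewise made up for illustration.

```rust
// Hypothetical stand-ins for wirefilter's types; NOT the real API.
struct Ctx<'e> {
    port: i32,
    name: &'e str,
}

/// Invariant in 's because of the boxed closure over Ctx<'s>.
struct Filter<'s> {
    check: Box<dyn Fn(&Ctx<'s>) -> bool + 's>,
}

impl<'s> Filter<'s> {
    fn execute(&self, ctx: &Ctx<'s>) -> bool {
        (self.check)(ctx)
    }
}

/// Parsed but not yet compiled filter: port is 2 or 5, name starts with `prefix`.
#[derive(Clone)]
struct FilterAst<'s> {
    prefix: &'s str,
}

impl<'s> FilterAst<'s> {
    fn compile(self) -> Filter<'s> {
        let prefix = self.prefix;
        Filter {
            check: Box::new(move |ctx: &Ctx<'s>| {
                (ctx.port == 2 || ctx.port == 5) && ctx.name.starts_with(prefix)
            }),
        }
    }
}

struct MyStruct {
    port: i32,
    name: String,
}

impl MyStruct {
    fn filter_matches<'s>(&'s self, filter: &Filter<'s>) -> bool {
        let ctx = Ctx { port: self.port, name: self.name.as_str() };
        filter.execute(&ctx)
    }
}

fn expensive_data_iterator() -> impl Iterator<Item = MyStruct> {
    (0..).map(|port| MyStruct { port, name: format!("http {}", port % 2) })
}

/// Collects the ports of the first `limit` matches, reading the data in chunks.
fn first_matches_chunked(limit: usize) -> Vec<i32> {
    const N: usize = 10; // chunk size, as in the snippet above
    let ast = FilterAst { prefix: "http" };
    let mut data = expensive_data_iterator(); // `by_ref` needs a mutable binding
    let mut found = Vec::new();

    while found.len() < limit {
        let cur_chunk: Vec<MyStruct> = data.by_ref().take(N).collect();
        if cur_chunk.is_empty() {
            break;
        }
        // One compilation per chunk; the chunk is declared first, so it
        // outlives the filter that borrows from its items.
        let cur_filter = ast.clone().compile();
        let wanted = limit - found.len();
        found.extend(
            cur_chunk
                .iter()
                .filter(|my_struct| my_struct.filter_matches(&cur_filter))
                .take(wanted)
                .map(|my_struct| my_struct.port),
        );
    }
    found
}

fn main() {
    println!("{:?}", first_matches_chunked(2));
}
```

Declaring cur_chunk before cur_filter matters here: locals are dropped in reverse order, so the filter is gone before the chunk it borrowed from. One caveat of this sketch is that take(N) may pull up to N - 1 items beyond the last match, which is worth keeping in mind if producing items is expensive.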
