Rust-polars: how can I pass the "other" parameter for the function "is_in" by reference?

I am very new to Rust so please excuse me if this is a trivial question.

I am trying to filter a dataframe as follows:

    let allowed = Series::from_iter(vec![
        "string1".to_string(),
        "string2".to_string(),
    ]);
    let df = LazyCsvReader::new(&fullpath)
        .has_header(true)
        .finish().unwrap()
        .filter(col("string_id").is_in(&allowed)).collect().unwrap(); 

It looks good to me since the signature of the is_in method looks like this:

    fn is_in(
        &self,
        _other: &Series
    ) -> Result<ChunkedArray<BooleanType>, PolarsError>

from [https://docs.rs/polars/latest/polars/series/trait.SeriesTrait.html#method.is_in]

However, when I compile it I get the following error:

error[E0277]: the trait bound `Expr: From<&polars::prelude::Series>` is not satisfied
    --> src/main.rs:33:40
     |
33   |         .filter(col("string_id").is_in(&allowed)).collect().unwrap();
     |                                  ----- ^^^^^^^^ the trait `From<&polars::prelude::Series>` is not implemented for `Expr`
     |                                  |
     |                                  required by a bound introduced by this call
     |
     = help: the following other types implement trait `From<T>`:
               <Expr as From<&str>>
               <Expr as From<AggExpr>>
               <Expr as From<bool>>
               <Expr as From<f32>>
               <Expr as From<f64>>
               <Expr as From<i32>>
               <Expr as From<i64>>
               <Expr as From<u32>>
               <Expr as From<u64>>
     = note: required for `&polars::prelude::Series` to implement `Into<Expr>`
note: required by a bound in `polars_plan::dsl::<impl Expr>::is_in`
    --> /home/myself/.cargo/registry/src/
     |
1393 |     pub fn is_in<E: Into<Expr>>(self, other: E) -> Self {
     |                     ^^^^^^^^^^ required by this bound in `polars_plan::dsl::<impl Expr>::is_in`

For more information about this error, try `rustc --explain E0277`.

To me this error looks very cryptic. I read the result of rustc --explain E0277 that says "You tried to use a type which doesn't implement some trait in a place which expected that trait", but this doesn't help in the slightest to identify which type doesn't implement which trait.

  • How do I fix this? Why doesn't it work?

NOTE: I know that writing lit(allowed) instead of &allowed works, but that moves allowed into the expression, so it cannot be used anywhere else afterwards. For example, I would like to do the following, but this code (obviously) fails with the error "use of moved value":

    let df = LazyCsvReader::new(&fullpath)
        .has_header(true)
        .finish().unwrap()
        .with_column(
            when(
                col("firstcolumn").is_in(lit(allowed))
                    .and(
                    col("secondcolumn").is_in(lit(allowed))
                    )
                )
                .then(lit("very good"))
                .otherwise(lit("very bad"))
                .alias("good_bad")
        )
        .collect().unwrap();

Bonus questions:

  • Why does it work with lit(allowed)? Shouldn't I pass the variable by reference as specified in the documentation?
  • How can I repeatedly use a Series for is_in like in the example above without having an error?

EDIT: I found a different signature for is_in that requires the second parameter to be an Expr, which would explain the need to use lit. However, it's still not clear how to use the same Series multiple times without getting the "use of moved value" error.

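The signature you quoted is for Series.is_in(), but you're calling Expr.is_in(), which differs: as the compiler note shows, it is declared as is_in<E: Into<Expr>>(self, other: E), so it takes other by value, and From<&Series> is not implemented for Expr.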
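To make the trait-bound error less cryptic, here is a minimal sketch using toy stand-ins. The Series and Expr types, col, lit and both_allowed below are hypothetical and are not the polars implementations; only the shape of is_in is copied from the signature in the compiler note. It shows why an owned Expr is accepted while a reference is not, and how the same value can be reused by cloning it.

```rust
// Toy stand-ins for illustration only; these are NOT the real polars types.
#[derive(Clone, Debug, PartialEq)]
struct Series(Vec<String>);

#[derive(Clone, Debug, PartialEq)]
enum Expr {
    Column(String),
    Literal(Series),
    IsIn(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

impl Expr {
    // Same shape as the signature in the compiler note: `other` is taken
    // by value and only needs to be convertible into an `Expr`.
    fn is_in<E: Into<Expr>>(self, other: E) -> Expr {
        Expr::IsIn(Box::new(self), Box::new(other.into()))
    }

    fn and(self, other: Expr) -> Expr {
        Expr::And(Box::new(self), Box::new(other))
    }
}

// Hypothetical helpers named after the ones used in the question.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn lit(s: Series) -> Expr {
    Expr::Literal(s)
}

// `lit(...)` returns an `Expr`, and every type is trivially `Into` itself,
// so the bound `E: Into<Expr>` is satisfied. Passing `allowed` (a `&Series`)
// directly would fail with E0277 here too, because nothing implements
// `From<&Series> for Expr`.
fn both_allowed(allowed: &Series) -> Expr {
    col("firstcolumn")
        .is_in(lit(allowed.clone()))
        .and(col("secondcolumn").is_in(lit(allowed.clone())))
}
```

Because both_allowed only borrows allowed and clones it for each literal, the caller keeps ownership and can keep using the value afterwards.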

You can use cols() to select multiple columns:

    .with_columns([
        cols(["firstcolumn", "secondcolumn"]).is_in(lit(allowed))
    ])

┌─────────────┬──────────────┬─────────────┐
│ firstcolumn ┆ secondcolumn ┆ thirdcolumn │
│ ---         ┆ ---          ┆ ---         │
│ bool        ┆ bool         ┆ str         │
╞═════════════╪══════════════╪═════════════╡
│ false       ┆ false        ┆ moo         │
│ true        ┆ false        ┆ foo         │
│ true        ┆ true         ┆ keepme      │
│ true        ┆ true         ┆ andme       │
└─────────────┴──────────────┴─────────────┘

When this is used inside .when(), the per-column results are combined with an implicit AND:

┌─────────────┬──────────────┬─────────────┬───────────┐
│ firstcolumn ┆ secondcolumn ┆ thirdcolumn ┆ good_bad  │
│ ---         ┆ ---          ┆ ---         ┆ ---       │
│ str         ┆ str          ┆ str         ┆ str       │
╞═════════════╪══════════════╪═════════════╪═══════════╡
│ a           ┆ b            ┆ moo         ┆ very bad  │
│ string1     ┆ no           ┆ foo         ┆ very bad  │
│ string2     ┆ string1      ┆ keepme      ┆ very good │
│ string1     ┆ string2      ┆ andme       ┆ very good │
└─────────────┴──────────────┴─────────────┴───────────┘

As for the moved value error, I have little Rust knowledge, but the compiler tells me:

help: consider cloning the value if the performance cost is acceptable
   |
15 |                 col("firstcolumn").is_in(lit(allowed.clone())).and(col("secondcolumn").is_in(lit(allowed))))
   |                                                     ++++++++

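And cloning a Series is a super cheap operation: as far as I know, a Series is a reference-counted handle, so clone() does not copy the underlying data.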
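To illustrate why such a clone can be cheap, here is a small sketch that assumes the column data sits behind a reference-counted pointer. SharedColumn is a hypothetical type written for this example, not the polars Series; it only demonstrates the general Arc pattern, where cloning copies a pointer and bumps a counter instead of duplicating the values.

```rust
use std::sync::Arc;

// Hypothetical column type: a thin handle around reference-counted data.
#[derive(Clone)]
struct SharedColumn {
    values: Arc<Vec<String>>,
}

impl SharedColumn {
    fn new(values: Vec<String>) -> Self {
        SharedColumn {
            values: Arc::new(values),
        }
    }

    // True when both handles point at the same heap allocation.
    fn shares_buffer_with(&self, other: &SharedColumn) -> bool {
        Arc::ptr_eq(&self.values, &other.values)
    }

    // Number of live handles that currently share the data.
    fn handle_count(&self) -> usize {
        Arc::strong_count(&self.values)
    }
}
```

Cloning a SharedColumn leaves the strings where they are, so two handles created this way report shares_buffer_with as true and a handle_count of 2.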
