
How to filter on column == var when var has the same name as the column? (inside pmap)

I have a tibble that I want to filter by comparing its columns against some variables. However, it's convenient for that variable to have the same name as the column. How can I force dplyr to evaluate the variable so it doesn't confuse the variable and column names?

library(dplyr)  ## provides tibble(), %>% and glimpse()
set.seed(2)
ngrp <- 3
npergrp <- 4
tib <- tibble(grp=rep(letters[1:ngrp], each=npergrp), 
              N=rep(1:npergrp, ngrp), 
              val=round(runif(npergrp*ngrp))) %>% print(n=Inf)
grp <- grp_ <- 'a'
tib %>% dplyr::filter(grp==grp_) %>% glimpse() ## works
tib %>% dplyr::filter(grp==grp) %>% glimpse()  ## undesired result, grp==grp always true
tib %>% dplyr::filter(grp=={{grp}}) %>% glimpse()  ## hey it works!
## slightly less toy example
tib %>% dplyr::filter(grp==grp_) %>% 
  dplyr::mutate(
    the_rest = purrr::pmap(
      .,
      function(grp, N, ...) {
        gg <- grp ## there must be a better way
        NN <- N
        tib %>% 
          dplyr::filter(
            # grp!=grp, ## always false
            # N==N      ## always true
            grp!=gg,
            N==NN
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    ),
    no_hugs = purrr::pmap(
      .,
      function(grp, N, ...) {
        tib %>% 
          dplyr::filter(
            grp!={{grp}}, ## ERROR! oh noes!
            N=={{N}}
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    )
  ) %>% 
  tidyr::unnest(cols = c(the_rest, no_hugs)) %>% 
  glimpse()

output:

# A tibble: 12 × 3
   grp       N   val
   <chr> <int> <dbl>
 1 a         1     0
 2 a         2     1
 3 a         3     1
 4 a         4     0
 5 b         1     1
 6 b         2     1
 7 b         3     0
 8 b         4     1
 9 c         1     0
10 c         2     1
11 c         3     1
12 c         4     0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Error in local_error_context(dots = dots, .index = i, mask = mask) : 
promise already under evaluation: recursive default argument reference or earlier problems?

# the_rest should be 1, 2, 1, 1

As often happens, writing the question taught me how to embrace variables using the double curly brace operator {{ }} (see https://dplyr.tidyverse.org/articles/programming.html and the related question "Use dynamic name for new column/variable in `dplyr`").

However, it doesn't work inside the pmap() call.

You need the .env pronoun so that the object 'grp' is evaluated from the calling environment rather than from the data (alternatively, inject its value with !!).

library(dplyr)
tib %>% 
   dplyr::filter(grp==.env$grp)

Output:

# A tibble: 4 × 3
  grp       N   val
  <chr> <int> <dbl>
1 a         1     0
2 a         2     1
3 a         3     1
4 a         4     0
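
For comparison, here is a minimal sketch of the two options mentioned above side by side: the .env pronoun (paired with .data to make the column side explicit) and injection with !!. The val column is hard-coded to the values printed in the question so the example does not depend on the random seed; the names with_env and with_bang are only illustrative.

```r
library(dplyr)

# Same data as in the question, with val written out explicitly
tib <- tibble(
  grp = rep(c("a", "b", "c"), each = 4),
  N   = rep(1:4, times = 3),
  val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
)

grp <- "a"  # a variable that shares its name with a column

# .data$grp is the column, .env$grp is the variable in the calling environment
with_env <- tib %>% filter(.data$grp == .env$grp)

# !!grp evaluates the variable first and injects its value ("a") into the expression
with_bang <- tib %>% filter(grp == !!grp)
```

Both calls should keep only the four rows of group 'a'. The {{ }} operator is meant for forwarding function arguments that name columns, so .env or !! is usually the better fit when the goal is to compare a column against a value.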

The .env pronoun can be used similarly inside the pmap code as well:

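library(purrr)
tib %>%
  dplyr::filter(grp == .env$grp_) %>%
  dplyr::mutate(
    the_rest = purrr::pmap_dbl(across(everything()),
      ~ {
        gg <- ..1  ## first column (grp) of the current row
        NN <- ..2  ## second column (N) of the current row
        tib %>%
          dplyr::filter(grp != gg, N == NN) %>%
          pull(val) %>%
          sum()
      }),
    no_hugs = purrr::pmap_dbl(across(all_of(names(tib))),
      ## note: .env$grp is not the per-row value (that is ..1);
      ## it matches here because every row has grp == 'a'
      ~ tib %>%
        dplyr::filter(grp != .env$grp, N == ..2) %>%
        pull(val) %>%
        sum())
  )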

Output:

# A tibble: 4 × 5
  grp       N   val the_rest no_hugs
  <chr> <int> <dbl>    <dbl>   <dbl>
1 a         1     0        1       1
2 a         2     1        2       2
3 a         3     1        1       1
4 a         4     0        1       1
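
If you prefer to keep the named arguments from the question (function(grp, N, ...)) instead of the positional ..1 and ..2, the following sketch shows one way that should work: inside the function, .env$grp and .env$N refer to the function's own arguments, while .data$grp and .data$N refer to the columns of tib. The helper name sum_other_groups and the intermediate group_a are hypothetical, and val is hard-coded to the values printed in the question.

```r
library(dplyr)
library(purrr)

# Same data as in the question, with val written out explicitly
tib <- tibble(
  grp = rep(c("a", "b", "c"), each = 4),
  N   = rep(1:4, times = 3),
  val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
)

# Hypothetical helper: sum val over rows of tib that belong to a
# different group but have the same N as the current row
sum_other_groups <- function(grp, N, ...) {
  tib %>%
    filter(.data$grp != .env$grp, .data$N == .env$N) %>%
    pull(val) %>%
    sum()
}

group_a <- tib %>% filter(grp == "a")

# pmap_dbl() passes each row's columns as named arguments (grp, N, val);
# val is absorbed by the ... of the helper
result <- group_a %>%
  mutate(the_rest = pmap_dbl(group_a, sum_other_groups))
```

With the data above, the_rest should come out as 1, 2, 1, 1, matching the expected values stated in the question, without the temporary gg and NN copies.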
