
How to filter on column == var when var has the same name as the column? (inside pmap)

I have a tibble that I want to filter by comparing its columns against some variables. However, it is convenient for a variable to have the same name as the column it is compared with. How can I force dplyr to evaluate the variable, so that it does not confuse the variable name with the column name?

library(tidyverse)  ## provides tibble, dplyr, purrr, tidyr and %>%
set.seed(2)
ngrp <- 3
npergrp <- 4
tib <- tibble(grp=rep(letters[1:ngrp], each=npergrp), 
              N=rep(1:npergrp, ngrp), 
              val=round(runif(npergrp*ngrp))) %>% print(n=Inf)
grp <- grp_ <- 'a'
tib %>% dplyr::filter(grp==grp_) %>% glimpse() ## works
tib %>% dplyr::filter(grp==grp) %>% glimpse()  ## undesired result, grp==grp always true
tib %>% dplyr::filter(grp=={{grp}}) %>% glimpse()  ## hey it works!
## slightly less toy example
tib %>% dplyr::filter(grp==grp_) %>% 
  dplyr::mutate(
    the_rest = purrr::pmap(
      .,
      function(grp, N, ...) {
        gg <- grp ## there must be a better way
        NN <- N
        tib %>% 
          dplyr::filter(
            # grp!=grp, ## always false
            # N==N      ## always true
            grp!=gg,
            N==NN
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    ),
    no_hugs = purrr::pmap(
      .,
      function(grp, N, ...) {
        tib %>% 
          dplyr::filter(
            grp!={{grp}}, ## ERROR! oh noes!
            N=={{N}}
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    )
  ) %>% 
  tidyr::unnest() %>% 
  glimpse()

Output:

# A tibble: 12 × 3
   grp       N   val
   <chr> <int> <dbl>
 1 a         1     0
 2 a         2     1
 3 a         3     1
 4 a         4     0
 5 b         1     1
 6 b         2     1
 7 b         3     0
 8 b         4     1
 9 c         1     0
10 c         2     1
11 c         3     1
12 c         4     0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Error in local_error_context(dots = dots, .index = i, mask = mask) : 
promise already under evaluation: recursive default argument reference or earlier problems?

# the_rest should be 1, 2, 1, 1

As often happens, writing the question taught me how to embrace variables using the double curly brace operator {{}} (see https://dplyr.tidyverse.org/articles/programming.html and the related question "Use dynamic name for new column/variable in `dplyr`").

However, it doesn't work inside the pmap.

The filter needs the .env pronoun so that the object 'grp' is looked up in the calling environment rather than among the columns of the data (alternatively, use !!).

library(dplyr)
tib %>% 
   dplyr::filter(grp==.env$grp)

Output:

# A tibble: 4 × 3
  grp       N   val
  <chr> <int> <dbl>
1 a         1     0
2 a         2     1
3 a         3     1
4 a         4     0
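
The !! alternative mentioned above can be sketched as follows. This is a minimal illustration, not part of the original answer: the tibble is typed in by hand from the printed output so that it does not depend on the random seed, and the names by_env and by_inject are made up for the comparison.

```r
library(dplyr)

# Same rows as the tibble printed in the question, entered by hand.
tib <- tibble::tibble(
  grp = rep(c("a", "b", "c"), each = 4),
  N   = rep(1:4, 3),
  val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
)
grp <- "a"

# .env$grp looks the name up in the calling environment, not in the columns.
by_env <- tib %>% dplyr::filter(grp == .env$grp)

# !!grp evaluates grp in the calling environment first and injects its value,
# so the filter effectively sees grp == "a".
by_inject <- tib %>% dplyr::filter(grp == !!grp)
```

Both calls should keep only the four rows of group "a". The .env form arguably reads more clearly, because it states where the value comes from.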

The .env pronoun can be used in the same way within the pmap code:

library(purrr)
tib %>%
  dplyr::filter(grp == .env$grp_) %>%
  dplyr::mutate(
    the_rest = purrr::pmap_dbl(
      across(everything()),
      ~ {
        gg <- ..1
        NN <- ..2
        tib %>%
          dplyr::filter(grp != gg, N == NN) %>%
          pull(val) %>%
          sum()
      }
    ),
    no_hugs = purrr::pmap_dbl(
      across(all_of(names(tib))),
      ~ tib %>%
        dplyr::filter(grp != .env$grp, N == ..2) %>%
        pull(val) %>%
        sum()
    )
  )

Output:

# A tibble: 4 × 5
  grp       N   val the_rest no_hugs
  <chr> <int> <dbl>    <dbl>   <dbl>
1 a         1     0        1       1
2 a         2     1        2       2
3 a         3     1        1       1
4 a         4     0        1       1
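
If you would rather keep the question's named arguments, function(grp, N, ...), the same pronoun should also work there, since .env refers to the environment the filter is called from, which here is the function body. The following is a sketch under that assumption: sum_other_groups and sub are hypothetical names, the tibble is typed in by hand from the printed output, and pmap_dbl is called outside mutate to keep the example small.

```r
library(dplyr)
library(purrr)

# Same rows as the tibble printed in the question, entered by hand.
tib <- tibble::tibble(
  grp = rep(c("a", "b", "c"), each = 4),
  N   = rep(1:4, 3),
  val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0)
)

# Hypothetical helper: for one row, sum val over the rows of tib that are in
# a different group but have the same N. Bare grp and N are columns of tib;
# .env$grp and .env$N are this function's arguments.
sum_other_groups <- function(grp, N, ...) {
  tib %>%
    dplyr::filter(grp != .env$grp, N == .env$N) %>%
    dplyr::pull(val) %>%
    sum()
}

# pmap_dbl passes each row's columns as named arguments; val lands in `...`.
sub <- dplyr::filter(tib, grp == "a")
sub$the_rest <- purrr::pmap_dbl(sub, sum_other_groups)
```

This avoids the gg <- grp and NN <- N copies from the question, and the_rest should come out as 1, 2, 1, 1, matching the expected values stated there.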
