How to filter on column==var when var has the same name as the column? (inside pmap)

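I have a tibble that I want to filter by comparing its columns against some variables. However, it is convenient for a variable to have the same name as the column. How can I force dplyr to evaluate the variable, so that it does not confuse the variable with the column name?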

library(dplyr)  ## provides tibble(), %>% and glimpse()
set.seed(2)
ngrp <- 3
npergrp <- 4
tib <- tibble(grp=rep(letters[1:ngrp], each=npergrp), 
              N=rep(1:npergrp, ngrp), 
              val=round(runif(npergrp*ngrp))) %>% print(n=Inf)
grp <- grp_ <- 'a'
tib %>% dplyr::filter(grp==grp_) %>% glimpse() ## works
tib %>% dplyr::filter(grp==grp) %>% glimpse()  ## undesired result, grp==grp always true
tib %>% dplyr::filter(grp=={{grp}}) %>% glimpse()  ## hey it works!
## slightly less toy example
tib %>% dplyr::filter(grp==grp_) %>% 
  dplyr::mutate(
    the_rest = purrr::pmap(
      .,
      function(grp, N, ...) {
        gg <- grp ## there must be a better way
        NN <- N
        tib %>% 
          dplyr::filter(
            # grp!=grp, ## always false
            # N==N      ## always true
            grp!=gg,
            N==NN
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    ),
    no_hugs = purrr::pmap(
      .,
      function(grp, N, ...) {
        tib %>% 
          dplyr::filter(
            grp!={{grp}}, ## ERROR! oh noes!
            N=={{N}}
          ) %>% 
          dplyr::pull(val) %>% 
          sum()
      }
    )
  ) %>% 
  tidyr::unnest(c(the_rest, no_hugs)) %>% 
  glimpse()

Output:

# A tibble: 12 × 3
   grp       N   val
   <chr> <int> <dbl>
 1 a         1     0
 2 a         2     1
 3 a         3     1
 4 a         4     0
 5 b         1     1
 6 b         2     1
 7 b         3     0
 8 b         4     1
 9 c         1     0
10 c         2     1
11 c         3     1
12 c         4     0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Rows: 4
Columns: 3
$ grp <chr> "a", "a", "a", "a"
$ N   <int> 1, 2, 3, 4
$ val <dbl> 0, 1, 1, 0
Error in local_error_context(dots = dots, .index = i, mask = mask) : 
promise already under evaluation: recursive default argument reference or earlier problems?

# the_rest should be 1, 2, 1, 1

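As often happens, writing this question taught me about wrapping the variable in the double curly brace operator {{}}; see https://dplyr.tidyverse.org/articles/programming.html and "Use dynamic name for new column/variable in `dplyr`".

However, it does not work inside pmap.

You need .env to evaluate the object "grp" from an environment other than the data (or use !!).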

library(dplyr)
tib %>% 
   dplyr::filter(grp==.env$grp)

-output

# A tibble: 4 × 3
  grp       N   val
  <chr> <int> <dbl>
1 a         1     0
2 a         2     1
3 a         3     1
4 a         4     0
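
For comparison, here is a minimal sketch of the `!!` alternative mentioned above next to `.env`. It assumes the same data as the question, with `val` typed in from the printed tibble rather than generated with `runif()`; the names `by_env` and `by_inject` are made up for illustration.

```r
library(dplyr)

## same values as the tibble printed in the question
tib <- tibble(grp = rep(letters[1:3], each = 4),
              N   = rep(1:4, 3),
              val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0))
grp <- "a"

## .data$grp is the column, .env$grp is the variable outside the data
by_env <- tib %>% dplyr::filter(.data$grp == .env$grp)

## !!grp injects the current value of the variable ("a") into the expression
by_inject <- tib %>% dplyr::filter(grp == !!grp)
```

Both calls should keep only the four rows of group "a". The `.data`/`.env` form is arguably easier to read because each side of the comparison says where it comes from.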

.env can be used in a similar way inside the pmap code

library(dplyr)
library(purrr)
tib %>%
  dplyr::filter(grp == .env$grp_) %>%
  dplyr::mutate(
    the_rest = purrr::pmap_dbl(
      across(everything()),
      ~ {
        gg <- ..1  ## grp of the current row
        NN <- ..2  ## N of the current row
        tib %>%
          dplyr::filter(grp != gg, N == NN) %>%
          pull(val) %>%
          sum()
      }
    ),
    no_hugs = purrr::pmap_dbl(
      across(all_of(names(tib))),
      ~ tib %>%
        dplyr::filter(grp != .env$grp, N == ..2) %>%  ## .env$grp is not a column of tib
        pull(val) %>%
        sum()
    )
  )

-output

# A tibble: 4 × 5
  grp       N   val the_rest no_hugs
  <chr> <int> <dbl>    <dbl>   <dbl>
1 a         1     0        1       1
2 a         2     1        2       2
3 a         3     1        1       1
4 a         4     0        1       1
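
If you would rather keep the named-argument function from the question, here is a sketch under the same assumptions (data typed in from the printed tibble, `res` is a made-up name). Inside the inner `filter()`, `.env$grp` and `.env$N` refer to the function arguments, while the bare `grp` and `N` are the columns of `tib`, so the `gg`/`NN` copies are not needed.

```r
library(dplyr)
library(purrr)

## same values as the tibble printed in the question
tib <- tibble(grp = rep(letters[1:3], each = 4),
              N   = rep(1:4, 3),
              val = c(0, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0))

res <- tib %>%
  dplyr::filter(grp == "a") %>%
  dplyr::mutate(
    the_rest = purrr::pmap_dbl(
      list(grp, N),                 ## one call per row, passed by position
      function(grp, N) {
        tib %>%
          ## left-hand sides: columns of tib; .env$...: this function's arguments
          dplyr::filter(grp != .env$grp, N == .env$N) %>%
          dplyr::pull(val) %>%
          sum()
      }
    )
  )
```

With this data `res$the_rest` comes out as 1, 2, 1, 1, which is what the question asked for.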
