R: Creating a new column based on a list of values across multiple columns

Question

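I want to create a new column (T/F) that flags whether any value from a list appears in any of several columns. For this example I'm using mtcars and searching for two values in two columns, but my actual problem involves many values across many columns.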

I have a working filter using filter_at(), included below, but I haven't been able to apply the same logic inside mutate:

# there are 7 cars with 6 cyl
mtcars %>%
  filter(cyl == 6)

# there are 2 cars with 19.2 mpg, one with 6 cyl, one with 8
mtcars %>% 
  filter(mpg == 19.2)

# there are 8 rows with either.
# these are the rows I want as TRUE
mtcars %>% 
  filter(mpg == 19.2 | cyl == 6)

# set the cols to look at
mtcars_cols <- mtcars %>% 
  select(matches('^(mp|cy)')) %>% names()

# set the values to look at
mtcars_numbs <- c(19.2, 6)

# result is 8 rows with either value in either col.
# this is a successful filter of the data
out1 <- mtcars %>% 
    filter_at(vars(mtcars_cols), any_vars(
        . %in% mtcars_numbs
        )
      )

# shows the set with all 6 cyl, plus one 8 cyl with 19.2 mpg
out1 %>% 
  select(mpg, cyl)

# This attempts to apply the filter list to the cols,
# but I only get 6 rows as TRUE
# I tried to change == to %in% but that results in an error
out2 <- mtcars %>%
    mutate(
      myset = rowSums(select(., mtcars_cols) == mtcars_numbs) > 0
    )

# only 6 rows returned
out2 %>% 
  filter(myset == T)

I'm not sure why those two rows are skipped. I suspect that rowSums is somehow aggregating them.

Answer 1

If we want to check each column against its corresponding value, map2 may be a better fit:

library(dplyr)
library(purrr)

# filter once per (column, value) pair, stack the results, then drop duplicate rows
map2_df(mtcars_cols, mtcars_numbs, ~
    mtcars %>%
      filter(!!rlang::sym(.x) == .y)) %>%
  distinct()

Note: comparing floating-point numbers with == can get into trouble, because the stored precision may differ and the comparison can return FALSE.
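To make the floating-point caveat concrete, here is a minimal base R sketch. The helper name near_any and its default tolerance are hypothetical choices for illustration, not part of any package; it only shows one way a tolerance-based check could stand in for an exact match.

```r
# 0.1 + 0.2 is not stored as exactly 0.3, so == reports FALSE
x <- 0.1 + 0.2
exact_match <- x == 0.3

# comparing within a tolerance avoids the problem
tolerant_match <- isTRUE(all.equal(x, 0.3))

# near_any() is a hypothetical helper: for each element of x, is it within
# `tol` of at least one of `values`? (a tolerance-based analogue of %in%)
near_any <- function(x, values, tol = 1e-8) {
  vapply(x, function(xi) any(abs(xi - values) < tol), logical(1))
}

flags <- near_any(c(19.2, 21.0, 6 + 1e-12), c(19.2, 6))
print(exact_match)     # FALSE
print(tolerant_match)  # TRUE
print(flags)           # TRUE FALSE TRUE
```

Whether a tolerance is needed depends on where the numbers come from; values read as literals, such as those in mtcars, usually match exactly, while computed values often do not.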

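Also, note that == only behaves as intended when the lhs and rhs have the same length, or when the rhs is a vector of length 1 (which is recycled). If the rhs is longer than 1 but shorter than the lhs, it is recycled in column order: the values are repeated down the rows of each column, so they are not matched one per column.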
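The recycling behaviour described above can be seen on a tiny made-up data frame. This is only an illustrative sketch; df and vals are hypothetical and chosen so that the intended result is obvious.

```r
# intent: flag rows where a == 1 or b == 2 (every row qualifies)
df <- data.frame(a = c(1, 1, 3, 3), b = c(5, 5, 2, 2))
vals <- c(1, 2)

# vals is recycled down the rows (1, 2, 1, 2) within each column,
# so row 2 is compared with 2 and row 3 with 1 in BOTH columns
recycled_hit <- unname(rowSums(df == vals) > 0)

# repeating each value once per row lines the values up with the columns
per_col_hit <- unname(rowSums(df == rep(vals, each = nrow(df))) > 0)

print(recycled_hit)  # TRUE FALSE FALSE TRUE
print(per_col_hit)   # TRUE TRUE TRUE TRUE
```

The recycled comparison silently misses rows 2 and 3, which is the same kind of gap as the two missing rows in the question.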

We can replicate the values so that the lengths match, and then it should work:

mtcars %>%
  mutate(
    # col() gives each cell's column index, so each column is compared with its own value
    myset = rowSums(select(., mtcars_cols) == mtcars_numbs[col(select(., mtcars_cols))]) > 0
  ) %>%
  pull(myset) %>%
  sum()
#[1] 8

The code above calls select twice to make the logic easier to follow. Alternatively, we can use rep:

mtcars %>%
  mutate(
    # repeat each value n() times (once per row) so the values line up column by column
    myset = rowSums(select(., mtcars_cols) == rep(mtcars_numbs, each = n())) > 0
  ) %>%
  pull(myset) %>%
  sum()
#[1] 8
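If the goal is the "any value in any column" logic of the original filter_at() call, rather than one value per column, a base R sketch along the following lines may also work. It is an alternative to the approach above, not part of the original answer, and the names cars, cols, vals and hits are illustrative.

```r
cars <- mtcars                 # work on a copy of the built-in dataset
cols <- c("mpg", "cyl")        # columns to search
vals <- c(19.2, 6)             # values to look for

# one logical column per searched column: is the cell any of the values?
hits <- sapply(cars[cols], function(x) x %in% vals)

# TRUE when at least one searched column matched in that row
cars$myset <- rowSums(hits) > 0

n_flagged <- sum(cars$myset)
print(n_flagged)  # 8
```

Because %in% checks every value against every column, this scales to many columns and many values without lining up the two vectors. dplyr 1.0.4 and later also provide if_any(), which should express the same idea directly inside mutate().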

Answer 1
1 2020-02-09 01:53:39