dplyr: mutate a new column based on multiple columns selected by a variable string

Question

Given this data:

df=data.frame(
  x1=c(2,0,0,NA,0,1,1,NA,0,1),
  x2=c(3,2,NA,5,3,2,NA,NA,4,5),
  x3=c(0,1,0,1,3,0,NA,NA,0,1),
  x4=c(1,0,NA,3,0,0,NA,0,0,1),
  x5=c(1,1,NA,1,3,4,NA,3,3,1))

I want to create an extra column min holding the row-wise minimum of selected columns using dplyr. That's easy using the column names:

df <- df %>% rowwise() %>% mutate(min = min(x2,x5))

But I have a large df with varying column names, so I need to match them from a character vector of names, mycols. Other threads tell me to use the select helper functions, but I must be missing something. Here's matches:

mycols <- c("x2","x5")
df <- df %>% rowwise() %>%
  mutate(min = min(select(matches(mycols))))
Error: is.string(match) is not TRUE

And one_of:

mycols <- c("x2","x5")
 df <- df %>%
 rowwise() %>%
 mutate(min = min(select(one_of(mycols))))
Error: no applicable method for 'select' applied to an object of class "c('integer', 'numeric')"
In addition: Warning message:
In one_of(c("x2", "x5")) : Unknown variables: `x2`, `x5`

What am I overlooking? Should select_ work? It doesn't in the following:

df <- df %>%
   rowwise() %>%
   mutate(min = min(select_(mycols)))
Error: no applicable method for 'select_' applied to an object of class "character"

And likewise:

df <- df %>%
  rowwise() %>%
  mutate(min = min(select_(matches(mycols))))
Error: is.string(match) is not TRUE

Answer 1

Here's another solution, a bit technical, using the purrr package from the tidyverse, which is designed for functional programming.

First, as the error message shows, the matches helper from dplyr takes a single regex string as its argument, not a vector. So one way forward is to find a regex that matches all your columns. (In the code below you can use whichever dplyr select helper you prefer.)
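If the names already live in a character vector, the regex can be built from it rather than written by hand. This is only a sketch: the variable names pattern and all_names are made up for illustration, and it assumes the column names contain no regex metacharacters.

```r
# Column names we want, kept as a plain character vector
mycols <- c("x2", "x5")

# Collapse into one anchored alternation: "^(x2|x5)$"
# (assumes the names contain no regex metacharacters such as "." or "(")
pattern <- paste0("^(", paste(mycols, collapse = "|"), ")$")

# Check which of the data frame's column names the pattern picks up
all_names <- c("x1", "x2", "x3", "x4", "x5")
matched <- all_names[grepl(pattern, all_names)]
print(matched)
```

The resulting pattern can then be passed to matches() in place of a hand-written regex.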

Then, purrr functions work well with dplyr once you understand the underlying functional-programming approach.

Solution to your problem:

df=data.frame(
  x1=c(2,0,0,NA,0,1,1,NA,0,1),
  x2=c(3,2,NA,5,3,2,NA,NA,4,5),
  x3=c(0,1,0,1,3,0,NA,NA,0,1),
  x4=c(1,0,NA,3,0,0,NA,0,0,1),
  x5=c(1,1,NA,1,3,4,NA,3,3,1))


# regex to get only x2 and x5 column
mycols <- "x[25]"

library(dplyr)

df %>%
  mutate(min_x2_x5 =
           # select columns that you want in df
           select(., matches(mycols)) %>% 
           # use pmap on this subset to get a vector with the min of each row.
           # a data frame is a list of columns, so pmap walks them in parallel, i.e. row by row
           purrr::pmap_dbl(min)
         )
#>    x1 x2 x3 x4 x5 min_x2_x5
#> 1   2  3  0  1  1         1
#> 2   0  2  1  0  1         1
#> 3   0 NA  0 NA NA        NA
#> 4  NA  5  1  3  1         1
#> 5   0  3  3  0  3         3
#> 6   1  2  0  0  4         2
#> 7   1 NA NA NA NA        NA
#> 8  NA NA NA  0  3        NA
#> 9   0  4  0  0  3         3
#> 10  1  5  1  1  1         1

I won't explain purrr any further here, but it works fine in your case.
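For readers who would rather keep mycols as a character vector, here is a hedged sketch of two alternatives. The first relies on c_across() and all_of(), which need dplyr 1.0.0 or later and so postdate this answer. The second uses only base R's pmin() through do.call(). The result names res_rowwise and res_pmin are invented for the example.

```r
library(dplyr)

df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# Option 1 (assumes dplyr >= 1.0.0): row-wise min over the named columns.
# all_of() turns the character vector into a column selection.
res_rowwise <- df %>%
  rowwise() %>%
  mutate(min = min(c_across(all_of(mycols)))) %>%
  ungroup()

# Option 2: pmin() is the vectorised parallel minimum, so no rowwise() is needed.
# df[mycols] is a list of columns, which do.call() spreads into pmin()'s arguments.
res_pmin <- df %>%
  mutate(min = do.call(pmin, df[mycols]))

print(res_pmin)
```

Both keep NA whenever any selected column is NA in that row, the same as the output above. On a large data frame the pmin() version is likely to be faster because it avoids per-row evaluation.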

Answer 2

This was a bit trickier. With standard evaluation (SE) you need to pass the operation as a string.

mycols <- '(x2,x5)'
f <- paste0('min',mycols)
df <- df %>% rowwise() %>% mutate_(min = f)
df
# A tibble: 10 × 6
#      x1    x2    x3    x4    x5   min
#   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
#1      2     3     0     1     1     1
#2      0     2     1     0     1     1
#3      0    NA     0    NA    NA    NA
#4     NA     5     1     3     1     1
#5      0     3     3     0     3     3
#6      1     2     0     0     4     2
#7      1    NA    NA    NA    NA    NA
#8     NA    NA    NA     0     3    NA
#9      0     4     0     0     3     3
#10     1     5     1     1     1     1
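Note that mutate_ and the other underscore verbs have been deprecated since dplyr 0.7.0. A rough equivalent under tidy evaluation, assuming a dplyr version of 0.7.0 or later with rlang installed, is to turn the names into symbols and splice them into the call. The result names res_min and res_pmin are invented for this sketch.

```r
library(dplyr)

df <- data.frame(
  x1 = c(2, 0, 0, NA, 0, 1, 1, NA, 0, 1),
  x2 = c(3, 2, NA, 5, 3, 2, NA, NA, 4, 5),
  x3 = c(0, 1, 0, 1, 3, 0, NA, NA, 0, 1),
  x4 = c(1, 0, NA, 3, 0, 0, NA, 0, 0, 1),
  x5 = c(1, 1, NA, 1, 3, 4, NA, 3, 3, 1))

mycols <- c("x2", "x5")

# rlang::syms() converts the strings to symbols; !!! splices them in as
# separate arguments, so the call below becomes min(x2, x5) for each row.
res_min <- df %>%
  rowwise() %>%
  mutate(min = min(!!!rlang::syms(mycols))) %>%
  ungroup()

# Same idea without rowwise(): pmin(x2, x5) is already vectorised.
res_pmin <- df %>%
  mutate(min = pmin(!!!rlang::syms(mycols)))

print(res_pmin)
```

This avoids building the expression by pasting strings together, so the column names can stay in an ordinary character vector.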

Answer 1: score 4, posted 2017-02-19 21:37:30

Answer 2: score 2, accepted, posted 2017-02-19 20:55:24
