dplyr mutate multiple columns based on names in vectors
I want to multiply two columns with each other using dplyr's mutate function.
Instead of writing a new line for each mutate call, I would like to use the column names stored in the vectors var1 and var2. For example, I want to add a column named result1 to my existing bankdata data frame that contains the product of the columns cash and loans. This should continue until 3 new columns have been created.
Reproducible code:
bankname <- c("Bank A", "Bank B", "Bank C", "Bank D", "Bank E")
bankid <- c(1, 2, 3, 4, 5)
year <- c(1881, 1881, 1881, 1881, 1881)
totass <- c(244789, 195755, 107736, 170600, 32000000)
cash <- c(7250, 10243, 13357, 35000, 351266)
bond <- c(20218, 185151, 177612, 20000, 314012)
loans <- c(29513, 2800, NA, 5000, NA)
bankdata <- data.frame(bankname, bankid, year, totass, cash, bond, loans)
The vectors var1 and var2 contain the names of the columns I want to multiply (cash*loans, bond*cash, loans*bankid), and output contains the names of the new columns:
var1 <- c("cash", "bond", "loans")
var2 <- c("loans","cash", "bankid")
output <- c("result1", "result2", "result3")
I would like to do something similar to this:
bankdata %>%
  mutate_at(.funs = funs(output = var1*var2), vars(var1, var2))
bankdata %>%
  mutate_at(.funs = funs(result1 = cash*., result2 = bond*., result3 = loans*.), vars(var2))
Using a tidyeval approach, we build a function that takes strings as inputs and creates a new column. Note the use of rlang::sym and !! (bang bang).
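To make the roles of rlang::sym and !! more concrete, here is a minimal sketch. The data frame toy and the variables a, b and e are hypothetical names introduced only for this illustration; they are not part of the original answer.

```r
library(rlang)

# Convert column-name strings into symbols
a <- sym("cash")
b <- sym("loans")

# !! unquotes the symbols inside a quoted expression,
# so the expression refers to the columns cash and loans
e <- expr(!!a * !!b)
print(e)

# Evaluate the expression against a small, made-up data frame
toy <- data.frame(cash = c(2, 3), loans = c(10, NA))
res <- eval_tidy(e, data = toy)
print(res)
```

Inside mutate() the same unquoting happens automatically, which is why the function in this answer can be driven by plain strings.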
After that we can use purrr::pmap_dfc to loop through var1, var2 and output in parallel to create new columns whose names are supplied by output.
library(tidyverse)
bankname <- c("Bank A", "Bank B", "Bank C", "Bank D", "Bank E")
bankid <- c(1, 2, 3, 4, 5)
year <- c(1881, 1881, 1881, 1881, 1881)
totass <- c(244789, 195755, 107736, 170600, 32000000)
cash <- c(7250, 10243, 13357, 35000, 351266)
bond <- c(20218, 185151, 177612, 20000, 314012)
loans <- c(29513, 2800, NA, 5000, NA)
bankdata <- data.frame(bankname, bankid, year, totass, cash, bond, loans)
originalNames <- names(bankdata)
var1 <- c("cash", "bond", "loans")
var2 <- c("loans","cash", "bankid")
output <- c("result1", "result2", "result3")
my_mutate <- function(df, var1, var2, output) {
  var1 <- rlang::sym(var1)
  var2 <- rlang::sym(var2)
  output <- rlang::sym(output)
  df <- df %>%
    mutate(!! output := !! var1 * !! var2)
  return(df)
}
# test
my_mutate(bankdata, var1[1], var2[1], output[1])
#> bankname bankid year totass cash bond loans result1
#> 1 Bank A 1 1881 244789 7250 20218 29513 213969250
#> 2 Bank B 2 1881 195755 10243 185151 2800 28680400
#> 3 Bank C 3 1881 107736 13357 177612 NA NA
#> 4 Bank D 4 1881 170600 35000 20000 5000 175000000
#> 5 Bank E 5 1881 32000000 351266 314012 NA NA
# loop through 3 lists simultaneously
# keep only original and result* columns
pmap_dfc(list(var1, var2, output), ~ my_mutate(bankdata, ..1, ..2, ..3)) %>%
  select(!! originalNames, starts_with("result"))
#> bankname bankid year totass cash bond loans result1 result2
#> 1 Bank A 1 1881 244789 7250 20218 29513 213969250 146580500
#> 2 Bank B 2 1881 195755 10243 185151 2800 28680400 1896501693
#> 3 Bank C 3 1881 107736 13357 177612 NA NA 2372363484
#> 4 Bank D 4 1881 170600 35000 20000 5000 175000000 700000000
#> 5 Bank E 5 1881 32000000 351266 314012 NA NA 110301739192
#> result3
#> 1 29513
#> 2 5600
#> 3 NA
#> 4 20000
#> 5 NA
Created on 2018-04-18 by the reprex package (v0.2.0).
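The pmap_dfc() call above binds three full copies of bankdata side by side and then selects the wanted columns, so it depends on how duplicate column names are handled, which may differ between dplyr versions. As an alternative sketch (not the original answer's method), the columns can be added one at a time in a loop using the .data pronoun. The name result and the trimmed-down bankdata below are choices made for this illustration.

```r
library(dplyr)

# Trimmed-down version of the question's data (only the columns used)
bankdata <- data.frame(
  bankid = c(1, 2, 3, 4, 5),
  cash   = c(7250, 10243, 13357, 35000, 351266),
  bond   = c(20218, 185151, 177612, 20000, 314012),
  loans  = c(29513, 2800, NA, 5000, NA)
)

var1   <- c("cash", "bond", "loans")
var2   <- c("loans", "cash", "bankid")
output <- c("result1", "result2", "result3")

# Add one new column per iteration; .data[[...]] looks up a column by its
# string name, and !! on the left of := supplies the new column's name
result <- bankdata
for (i in seq_along(output)) {
  result <- result %>%
    mutate(!!output[i] := .data[[var1[i]]] * .data[[var2[i]]])
}

print(result)
```

Because each iteration works on the data frame produced by the previous one, no duplicate columns are created and no select() step is needed afterwards.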