R mutate multiple columns with ifelse
This is similar to another question ( R Mutate multiple columns with ifelse()-condition ), but I have trouble applying that answer to my problem.
Here's a reproducible example:
df <- structure(list(comm_id = c("060015", "060015", "060015", "060015",
"060015", "060015", "060015", "060015", "060015", "060015", "060015"
), trans_year = c(1992, 1993, 1994, 1995, 1996, 1997, 1998, 1999,
2000, 2001, 2002), f10_1 = c(1996, 1996, 1996, 1996, 1996, 1996,
1996, 1996, 1996, 1996, 1996), f10_2 = c(1997, 1997, 1997, 1997,
1997, 1997, 1997, 1997, 1997, 1997, 1997)), row.names = c(NA,
-11L), class = c("tbl_df", "tbl", "data.frame"))
I want to create additional columns using an ifelse condition, which can be done by brute force as follows. But my actual problem has more than 10 such columns, so it would benefit a lot from a more elegant approach.
df %>%
mutate(post_f10_1 = ifelse(trans_year >= f10_1 & trans_year < f10_1 +5, 1, 0),
post_f10_2 = ifelse(trans_year >= f10_2 & trans_year < f10_2 +5, 1, 0))
I've tried a couple of different approaches, both of which failed:
With base R:
n <- c(1:2)
df[paste0("post_f10_", n)] <- lapply(n, function(x)
ifelse(df$trans_year >= paste0("f10_", x) & df$trans_year < paste0("f10_", x) + 5, 1, 0))
# Error in paste0("f10_", x) + 5 : non-numeric argument to binary operator
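The error above arises because paste0("f10_", x) returns the character string "f10_1", not the column of that name, so adding 5 to it fails. A minimal sketch of one possible fix is to look the column up with df[[...]]. The small data frame below is a stand-in that reuses the question's column names; it is an assumption for illustration, not the full data.

```r
# Stand-in data with the same column names as the question (illustrative only)
df <- data.frame(
  trans_year = 1992:2002,
  f10_1 = 1996,
  f10_2 = 1997
)

n <- 1:2
df[paste0("post_f10_", n)] <- lapply(n, function(x) {
  # Fetch the column by its name instead of using the name string itself
  start <- df[[paste0("f10_", x)]]
  ifelse(df$trans_year >= start & df$trans_year < start + 5, 1, 0)
})

df
```

With this change the comparison is made against the numeric column values, so the arithmetic in start + 5 is well defined.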
With the new across function from tidyverse:
df %>%
mutate(across(starts_with("f10_"),
~ ifelse(trnas_year >= .x & trans_year < .x + 5, 1, 0), .names = "post_{col}"))
# Error: Problem with `mutate()` input `..1`.
# x object 'trnas_year' not found
# ℹ Input `..1` is `across(...)`.
The output I want looks like:
comm_id trans_year f10_1 f10_2 post_f10_1 post_f10_2
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 060015 1992 1996 1997 0 0
2 060015 1993 1996 1997 0 0
3 060015 1994 1996 1997 0 0
4 060015 1995 1996 1997 0 0
5 060015 1996 1996 1997 1 0
6 060015 1997 1996 1997 1 1
7 060015 1998 1996 1997 1 1
8 060015 1999 1996 1997 1 1
9 060015 2000 1996 1997 1 1
10 060015 2001 1996 1997 0 1
11 060015 2002 1996 1997 0 0
If possible, I'd prefer a tidyverse approach. Thanks!
Update
My original tidyverse approach did not work because of a typo (trnas_year instead of trans_year). With the typo fixed it runs as shown here. Also, the answer below is much more elegant than what I posted.
df %>%
mutate(across(starts_with("f10_"),
~ ifelse(trans_year >= .x & trans_year < .x + 5, 1, 0), .names = "post_{col}"))
# A tibble: 11 x 6
comm_id trans_year f10_1 f10_2 post_f10_1 post_f10_2
<chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 060015 1992 1996 1997 0 0
2 060015 1993 1996 1997 0 0
3 060015 1994 1996 1997 0 0
4 060015 1995 1996 1997 0 0
5 060015 1996 1996 1997 1 0
6 060015 1997 1996 1997 1 1
7 060015 1998 1996 1997 1 1
8 060015 1999 1996 1997 1 1
9 060015 2000 1996 1997 1 1
10 060015 2001 1996 1997 0 1
11 060015 2002 1996 1997 0 0
You can use:
library(dplyr)
df %>%
mutate(across(starts_with("f10_"),
~as.integer(trans_year >= . & trans_year < (. + 5)),
.names = 'post_{col}'))
# comm_id trans_year f10_1 f10_2 post_f10_1 post_f10_2
# <chr> <dbl> <dbl> <dbl> <int> <int>
# 1 060015 1992 1996 1997 0 0
# 2 060015 1993 1996 1997 0 0
# 3 060015 1994 1996 1997 0 0
# 4 060015 1995 1996 1997 0 0
# 5 060015 1996 1996 1997 1 0
# 6 060015 1997 1996 1997 1 1
# 7 060015 1998 1996 1997 1 1
# 8 060015 1999 1996 1997 1 1
# 9 060015 2000 1996 1997 1 1
#10 060015 2001 1996 1997 0 1
#11 060015 2002 1996 1997 0 0
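The answer drops ifelse because the condition is already a logical vector, and as.integer maps TRUE to 1 and FALSE to 0. A small base R sketch of that equivalence follows; the vectors are made up for illustration.

```r
# Illustrative values, not taken from the question's data
trans_year <- c(1995, 1996, 2000, 2001)
start <- 1996

in_window <- trans_year >= start & trans_year < start + 5

as_int <- as.integer(in_window)       # integer 0/1
as_dbl <- ifelse(in_window, 1, 0)     # double 0/1, same values

as_int
as_dbl
```

The only difference is the column type (<int> rather than <dbl>); as.numeric(in_window) could be used instead if a double column is required.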
Or in base R with lapply:
cols <- paste0('f10_', 1:2)
df[paste0('post_', cols)] <- lapply(df[cols], function(x)
as.integer(df$trans_year >= x & df$trans_year < (x + 5)))
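Since the real data has more than 10 such columns, the column names could also be collected by pattern rather than typed out. The sketch below assumes every source column is named f10_ followed by a number. Both the extra f10_3 column and the regular expression are hypothetical additions for illustration.

```r
# Stand-in data; f10_3 is a made-up extra column to show the pattern scaling
df <- data.frame(
  trans_year = 1992:2002,
  f10_1 = 1996,
  f10_2 = 1997,
  f10_3 = 2000
)

# Pick up every column named f10_<number>, however many there are
cols <- grep("^f10_[0-9]+$", names(df), value = TRUE)

df[paste0("post_", cols)] <- lapply(df[cols], function(x)
  as.integer(df$trans_year >= x & df$trans_year < (x + 5)))

names(df)
```

Anchoring the pattern with ^ and $ keeps the already created post_f10_ columns from being matched if the code is run a second time.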