Creating new variables from multiple variables using mutate() and across() in dplyr 1.0.0
I need to mutate several columns that share the same prefix into new columns, all in the same way.
Here is some toy data:
df <- data.frame(su_1 = round(rnorm(12), 2),
                 su_2 = round(rnorm(12), 2),
                 su_3 = round(rnorm(12), 2))
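Now say I want to sort the continuous values of each variable into discrete bins. I can do this with three separate, near-identical steps, one per column, like this: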
df %>% mutate(su_1_disc = ifelse(su_1 < 0, "less",
                          ifelse(su_1 > 0 & su_1 <= 0.5, "mid", "lots"))) -> df
df %>% mutate(su_2_disc = ifelse(su_2 < 0, "less",
                          ifelse(su_2 > 0 & su_2 <= 0.5, "mid", "lots"))) -> df
df %>% mutate(su_3_disc = ifelse(su_3 < 0, "less",
                          ifelse(su_3 > 0 & su_3 <= 0.5, "mid", "lots"))) -> df
df
# output
# su_1 su_2 su_3 su_1_disc su_2_disc su_3_disc
# 1 1.99 0.77 -0.17 lots lots less
# 2 0.51 -0.76 -1.24 lots less less
# 3 1.50 -0.36 0.28 lots less mid
# 4 0.86 0.88 -0.52 lots lots less
# 5 0.08 0.63 -0.76 mid lots less
# 6 -0.51 -0.99 0.01 less less mid
# 7 0.35 1.59 0.19 mid lots mid
# 8 0.16 0.35 0.38 mid mid mid
# 9 -0.75 -0.45 1.75 less less lots
# 10 0.97 0.62 -0.05 lots lots less
# 11 -0.07 0.47 -0.24 less mid less
# 12 0.61 -0.27 -1.55 lots less less
But I would like to do this in a single step using the new dplyr 1.0.0 functionality.
I tried:
df %>%
  mutate(across(starts_with("su_"),
                ifelse(.x < 0, "less",
                       ifelse(.x > 0 & .x <= 0.5, "mid", "lots"))))
But it throws an error. I know .names needs to go in there somewhere, but I'm a bit lost.
You can use:
library(dplyr)
df %>%
  mutate(across(starts_with("su_"),
                ~ifelse(.x < 0, "less",
                        ifelse(.x > 0 & .x <= 0.5, "mid", "lots")),
                .names = '{col}_disc'))
# su_1 su_2 su_3 su_1_disc su_2_disc su_3_disc
#1 0.40 0.57 -0.11 mid lots less
#2 1.82 -0.55 0.44 lots less mid
#3 0.44 1.47 -0.39 mid lots less
#4 -0.82 0.00 -0.12 less lots less
#5 0.17 -0.10 -1.55 mid less less
#6 0.20 0.98 -1.02 mid lots less
#7 -0.01 1.12 -0.30 less lots less
#8 -0.70 0.31 0.35 less mid mid
#9 0.46 1.18 -0.22 mid lots less
#10 -1.09 0.03 -0.85 less mid less
#11 -0.03 1.81 1.28 less lots lots
#12 -0.11 1.64 -0.51 less lots less
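The main difference from the attempt in the question is the leading ~, which turns the ifelse() expression into a lambda that across() can call once per column. Without it, the expression is most likely evaluated straight away, before any .x exists. An ordinary named function should work just as well. The following is only a sketch: the helper name bin_su and the small fixed data frame are hypothetical and chosen so the result is reproducible.

```r
library(dplyr)

# Hypothetical helper (not from the original post); same rule as the nested ifelse()
bin_su <- function(x) {
  ifelse(x < 0, "less",
         ifelse(x > 0 & x <= 0.5, "mid", "lots"))
}

# Hypothetical fixed values instead of rnorm(), so the result is reproducible
df <- data.frame(su_1 = c(-0.51, 0.08, 0.86),
                 su_2 = c(0.35, -0.99, 0.63))

# Pass the function itself to across(); .names builds the new column names
df_disc <- df %>%
  mutate(across(starts_with("su_"), bin_su, .names = "{.col}_disc"))

df_disc
```

Because .names is supplied, the original su_ columns are kept and the binned versions are appended next to them.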
You can also replace ifelse with case_when or cut.
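As a rough sketch of the cut option, assuming break points at 0 and 0.5 (the data frame below is hypothetical, with fixed values for reproducibility). Note that cut() returns a factor rather than a character vector, and with the default right-closed intervals a value of exactly 0 falls into "less", whereas the nested ifelse above would label it "lots".

```r
library(dplyr)

# Hypothetical fixed values instead of rnorm(), so the result is reproducible
df <- data.frame(su_1 = c(-0.51, 0.08, 0.86),
                 su_2 = c(0.35, -0.99, 0.63))

# Intervals are (-Inf, 0], (0, 0.5], (0.5, Inf) because right = TRUE is the default
df_cut <- df %>%
  mutate(across(starts_with("su_"),
                ~ cut(.x, breaks = c(-Inf, 0, 0.5, Inf),
                      labels = c("less", "mid", "lots")),
                .names = "{.col}_disc"))

df_cut
```

If character columns are preferred, wrapping the call in as.character() should give the same type as the ifelse version.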
Consider using case_when instead of nested ifelse:
library(dplyr)
df %>%
  mutate(across(starts_with("su_"),
                ~ case_when(. < 0 ~ "less",
                            between(., 0, 0.5) ~ "mid",
                            TRUE ~ "lots"),
                .names = "{.col}_disc"))
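One thing to be aware of: between() is inclusive at both ends, so this version is not quite identical to the nested ifelse. A value of exactly 0 becomes "mid" here, while the ifelse condition .x > 0 sends it to "lots" (as in the earlier output, where su_2 = 0.00 is labeled lots). A small check on a hypothetical vector x illustrates the difference:

```r
library(dplyr)

# Hypothetical test values, including the boundaries 0 and 0.5
x <- c(-0.2, 0, 0.3, 0.5, 0.9)

# Rule from the question: strict inequality at 0
nested <- ifelse(x < 0, "less",
                 ifelse(x > 0 & x <= 0.5, "mid", "lots"))

# Rule from the case_when answer: between() includes both 0 and 0.5
cw <- case_when(x < 0 ~ "less",
                between(x, 0, 0.5) ~ "mid",
                TRUE ~ "lots")

data.frame(x, nested, cw)
```

Since the toy data is rounded to two decimals, exact zeros can occur, so it is worth deciding which bin they should belong to.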