I'm trying to create an extra column based on multiple conditions in R.
My current data frame is this:
df = data.frame(
  var1 = c(0, 4, 8, 2, 4, 10, 2, 3, 2, 9),
  var2 = c(9, 10, 5, 4, 7, 8, 6, 9, 7, 2),
  var3 = c(3, 3, 5, 5, 4, 5, 5, 2, 2, 1))
df
I want the new column to be a binary "Yes" or "No", where "Yes" means the row is above the column mean for var1 and also above the column mean for at least one of var2 or var3.
What I've tried so far:
mean(df$var1, na.rm = TRUE)
mean(df$var2, na.rm = TRUE)
mean(df$var3, na.rm = TRUE)
These give 4.4, 6.7, and 3.5 respectively.
Then to create the new column, I try:
library(dplyr)
df %>%
dplyr::mutate(
newvar = ifelse(
var1 > 4.4 &
(var2 > 6.7 |
var3 > 3.5),
"Yes",
"No")
I'm not sure how to state this correctly - I realise that what I have above is incorrect (gives an error "argument "yes" is missing, with no default"), but I'm unsure of how to approach this and would really appreciate any advice.
Thanks.
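You were very close: you just need to wrap the condition for ifelse() in brackets and make sure every opening parenthesis is closed. Computing the means inside mutate() also avoids hard-coding them.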
df |>
  mutate(
    newvar = ifelse(
      (
        var1 > mean(var1) &
          (var2 > mean(var2) | var3 > mean(var3))
      ),
      "Yes",
      "No"
    )
  )
# var1 var2 var3 newvar
# 1 0 9 3 No
# 2 4 10 3 No
# 3 8 5 5 Yes
# 4 2 4 5 No
# 5 4 7 4 No
# 6 10 8 5 Yes
# 7 2 6 5 No
# 8 3 9 2 No
# 9 2 7 2 No
# 10 9 2 1 No
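If the columns might contain missing values, the means would need na.rm = TRUE, as in the question. Here is a minimal sketch of the same logic in base R, assuming the df from the question; the names m1, m2 and m3 are only illustrative.

```r
# Same data as in the question
df <- data.frame(
  var1 = c(0, 4, 8, 2, 4, 10, 2, 3, 2, 9),
  var2 = c(9, 10, 5, 4, 7, 8, 6, 9, 7, 2),
  var3 = c(3, 3, 5, 5, 4, 5, 5, 2, 2, 1)
)

# Column means, ignoring any NA values (m1, m2, m3 are illustrative names)
m1 <- mean(df$var1, na.rm = TRUE)
m2 <- mean(df$var2, na.rm = TRUE)
m3 <- mean(df$var3, na.rm = TRUE)

# "Yes" when var1 is above its mean AND at least one of var2/var3 is above its mean
df$newvar <- ifelse(df$var1 > m1 & (df$var2 > m2 | df$var3 > m3), "Yes", "No")
```

Note that a row whose condition cannot be decided because of an NA would still come out as NA rather than "No", so you may want to handle those rows separately.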
We can also use if_any() (or if_all(), when every column has to meet the condition):
library(dplyr)
df <- df %>%
  mutate(newvar = case_when(
    if_any(var2:var3, ~ .x > mean(.x)) & var1 > mean(var1) ~ "Yes",
    TRUE ~ "No"
  ))
Output:
df
var1 var2 var3 newvar
1 0 9 3 No
2 4 10 3 No
3 8 5 5 Yes
4 2 4 5 No
5 4 7 4 No
6 10 8 5 Yes
7 2 6 5 No
8 3 9 2 No
9 2 7 2 No
10 9 2 1 No
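To see how if_any() and if_all() differ, here is a small sketch on the same data. It assumes dplyr 1.0.4 or later (where these helpers were introduced); the column names any_above and all_above are made up for illustration.

```r
library(dplyr)

# Same data as in the question
df <- data.frame(
  var1 = c(0, 4, 8, 2, 4, 10, 2, 3, 2, 9),
  var2 = c(9, 10, 5, 4, 7, 8, 6, 9, 7, 2),
  var3 = c(3, 3, 5, 5, 4, 5, 5, 2, 2, 1)
)

flags <- df %>%
  mutate(
    # TRUE when var2 OR var3 is above its own column mean
    any_above = if_any(var2:var3, ~ .x > mean(.x)),
    # TRUE only when var2 AND var3 are both above their column means
    all_above = if_all(var2:var3, ~ .x > mean(.x))
  )
```

if_any() matches the "either var2 or var3" wording in the question; swapping in if_all() would require both columns to be above their means.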