Troubleshooting case_when function using tidyverse in R
Simple question: what don't I understand about how case_when works? In the example below, I expected 4 levels in season but I get only two.
Thanks
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day > 60 | day <= 151 ~ "spring",
      day > 151 | day <= 242 ~ "summer",
      day > 242 | day <= 335 ~ "autumn"
    )
  )
Expressions 2 to 4 should use & instead of |. The reason is that with |, the second condition (day > 60 | day <= 151) is TRUE for every day, so each row that does not match the first condition is labelled "spring" and the remaining conditions are never reached.
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day > 60 & day <= 151 ~ "spring",
      day > 151 & day <= 242 ~ "summer",
      day > 242 & day <= 335 ~ "autumn"
    )
  )
Checking:
> n_distinct(data$season)
[1] 4
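To see why the question's original conditions produced only two values, it can help to evaluate the conditions on their own. The following is a minimal base R sketch; the variable names first and second are only illustrative and do not come from the original post.

```r
day <- 1:366

# First condition from the question: the "winter" days
first <- day <= 60 | day > 335

# Second condition as originally written with |
second <- day > 60 | day <= 151

# Every day is either above 60 or at most 151, so this is TRUE everywhere
all(second)

# Rows matched by the first condition ("winter")
sum(first)

# Rows that fall through to the second condition ("spring")
sum(!first & second)

# The two counts add up to all 366 rows, so nothing is left
# for the "summer" and "autumn" conditions
sum(first) + sum(!first & second) == length(day)
```

Because the second condition already covers every remaining row, the third and fourth conditions never get a chance to match.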
Actually, you can shorten this case_when() call a bit, because case_when() evaluates the conditions in order and, for each row, uses the first one that is met. So once the values lower than or equal to 60 or larger than 335 have been handled, the next condition is sufficiently defined by day <= 151:
library(dplyr)
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 | day > 335 ~ "winter",
      day <= 151 ~ "spring",
      day <= 242 ~ "summer",
      day <= 335 ~ "autumn"
    )
  )
You can also make use of the TRUE case, which applies when none of the prior conditions is met:
data <- tibble(day = 1:366) %>%
  mutate(
    season = case_when(
      day <= 60 ~ "winter",
      day <= 151 ~ "spring",
      day <= 242 ~ "summer",
      day <= 335 ~ "autumn",
      TRUE ~ "winter"
    )
  )
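As a quick sanity check, one could confirm that the shortened form with the TRUE fallback gives the same labels as the fully spelled-out conditions using &. This is only a sketch: it calls case_when() directly on a vector rather than inside mutate(), and the names explicit and shortened are illustrative, not from the original answers.

```r
library(dplyr)

day <- 1:366

# Fully explicit version: each condition bounds the range on both sides
explicit <- case_when(
  day <= 60 | day > 335 ~ "winter",
  day > 60 & day <= 151 ~ "spring",
  day > 151 & day <= 242 ~ "summer",
  day > 242 & day <= 335 ~ "autumn"
)

# Shortened version: relies on the first matching condition winning,
# with TRUE catching everything above 335
shortened <- case_when(
  day <= 60 ~ "winter",
  day <= 151 ~ "spring",
  day <= 242 ~ "summer",
  day <= 335 ~ "autumn",
  TRUE ~ "winter"
)

# Both approaches should label every day identically
identical(explicit, shortened)
```

The shortened version depends on the order of the conditions, so reordering them would change the result, whereas the explicit version does not care about order.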
Stop using case_when and use cut instead.
library(dplyr)
tibble(day = 1:366) |>
  mutate(
    season = cut(day,
      c(0, 60, 151, 242, 335, 366),
      c("winter", "spring", "summer", "autumn", "winter")
    )
  )
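Two details of the cut approach may be worth checking before relying on it. By default the intervals are closed on the right, so the breaks behave like the <= comparisons above, and the result is a factor rather than a character vector. The base R sketch below assumes R 3.4.0 or later, where factor() accepts a repeated label and merges it into a single level.

```r
day <- 1:366

# Intervals are (0, 60], (60, 151], (151, 242], (242, 335], (335, 366]
season <- cut(
  day,
  breaks = c(0, 60, 151, 242, 335, 366),
  labels = c("winter", "spring", "summer", "autumn", "winter")
)

# cut() returns a factor; the repeated "winter" label becomes one level
is.factor(season)
levels(season)

# Boundary days fall into the lower interval because of right-closed bins
as.character(season[c(60, 61, 335, 336)])

# Number of days per season
table(season)
```

If a character column is needed, as case_when would produce, wrapping the result in as.character() is one option.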