if_else statement not calculating mutation values correctly
# Using the data distribution to determine a (descriptive) position in the dataset
The code is as follows:
jobs_df <- jobs_df %>%
  mutate(description = if_else(quan_value < 'q1', "Lowest",
                       if_else(quan_value < 'q2', "Low",
                       if_else(quan_value < 'q3', "Medium",
                       if_else(quan_value < 'q4', "High",
                       if_else(quan_value < 'q5', "Highest", NA_character_))))))
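Here the "description" for each row of the dataframe should be Lowest, Low, Medium, High, or Highest, and q1, q2, q3, q4, q5 are the quintiles of the distribution of the "quan_value" column.
The dataframe (jobs_df) looks like this: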
  jobs         quan_value    q1    q2    q3    q4    q5
  <chr>             <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Banker              1.3     2     4     6     8     1
2 Accountant          2.4     2     4     6     8     1
3 Waiter              4.2     2     4     6     8     1
4 Barista             6.3     2     4     6     8     1
5 Train driver        9.1     2     4     6     8     1
"description" is the new column I want to create from the if_else statement, but it mostly just returns "Medium" as the result.
Whenever I see more than two nested if_else (or ifelse, or fifelse) calls, I tend to reach for case_when:
jobs_df %>%
  mutate(description = case_when(
    quan_value < q1 ~ "Lowest",
    quan_value < q2 ~ "Low",
    quan_value < q3 ~ "Medium",
    quan_value < q4 ~ "High",
    quan_value < q5 ~ "Highest",
    TRUE ~ NA_character_)
  )
# jobs quan_value q1 q2 q3 q4 q5 description
# 1 Banker 1.3 2 4 6 8 1 Lowest
# 2 Accountant 2.4 2 4 6 8 1 Low
# 3 Waiter 4.2 2 4 6 8 1 Medium
# 4 Barista 6.3 2 4 6 8 1 High
# 5 Train driver 9.1 2 4 6 8 1 <NA>
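Note that the thresholds in the case_when call are written as bare column names (q1, not 'q1'). The following is only a minimal sketch of why that likely matters, using plain vectors in place of the dataframe columns; the names quoted, as_text and unquoted are made up for illustration. With quotes, 'q1' is a character string, so R coerces the numbers to text and compares them against the literal string "q1" instead of the values in the q1 column.

```r
quan_value <- c(1.3, 2.4, 4.2, 6.3, 9.1)
q1 <- 2

# Quoted: 'q1' is a length-one character vector, so the numbers are
# coerced to character and compared as text against the string "q1".
quoted <- quan_value < 'q1'

# The same comparison with the coercion written out explicitly
as_text <- as.character(quan_value) < "q1"
identical(quoted, as_text)

# Unquoted: a numeric comparison against the value stored in q1
unquoted <- quan_value < q1
unquoted
```

The result of the text comparison depends on the locale's string collation, so it should not be relied on; only the unquoted form compares numbers with numbers.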
Data
jobs_df <- structure(list(jobs = c("Banker", "Accountant", "Waiter", "Barista", "Train driver"), quan_value = c(1.3, 2.4, 4.2, 6.3, 9.1), q1 = c(2L, 2L, 2L, 2L, 2L), q2 = c(4L, 4L, 4L, 4L, 4L), q3 = c(6L, 6L, 6L, 6L, 6L), q4 = c(8L, 8L, 8L, 8L, 8L), q5 = c(1L, 1L, 1L, 1L, 1L)), row.names = c("1", "2", "3", "4", "5"), class = "data.frame")
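In this sample data q5 is 1, so no row can satisfy quan_value < q5 and "Highest" is never assigned. If the thresholds are meant to be quintiles of quan_value, as the question describes, one possible sketch is to compute them with quantile() and let the final branch catch everything above the 80% cut-off. This is an assumption about the intent, not the original poster's method; breaks and labelled are hypothetical names, and whether boundaries use < or <= is a choice to adapt.

```r
library(dplyr)

jobs_df <- data.frame(
  jobs = c("Banker", "Accountant", "Waiter", "Barista", "Train driver"),
  quan_value = c(1.3, 2.4, 4.2, 6.3, 9.1)
)

# Hypothetical helper vector: 20%, 40%, 60% and 80% quantiles of quan_value
breaks <- quantile(jobs_df$quan_value,
                   probs = c(0.2, 0.4, 0.6, 0.8),
                   names = FALSE)

labelled <- jobs_df %>%
  mutate(description = case_when(
    is.na(quan_value)      ~ NA_character_,  # keep missing values missing
    quan_value < breaks[1] ~ "Lowest",
    quan_value < breaks[2] ~ "Low",
    quan_value < breaks[3] ~ "Medium",
    quan_value < breaks[4] ~ "High",
    TRUE                   ~ "Highest"       # everything at or above the 80% cut-off
  ))

labelled
```

Because case_when evaluates its conditions in order and uses the first match, each row only needs to be tested against the upper bound of its band.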