Categorize categorical data in R

I am trying to categorize a column of categorical data:

| current       | desired |
| DN974         | a       |
| DN469B        | a       |
| DN469W;DN469E | b       |
| DN80          | b       |
| EZDH01        | c       |
| DN971         | c       |

What I tried:

a <- c("DN974", "DN469B", "DN469W;DN469E", "DN80", "EZDH01", "DN971")
df <- data.frame(a)
df <- mutate(df, a=if_else((a=="DN974" | a=="DN469B"), "a", 
                          (if_else(a=="DN469W;DN469E" | a=="DN80"), "b", "c")))

I am trying to use the if_else function, but I can't make it work. I get the error: Error: unexpected ',' in: "df <- mutate(df, a=if_else((a=="DN974" | a=="DN469B"), "a", (if_else(a=="DN469W;DN469E" | a=="DN80")

Am I using the correct function, and if so, what am I doing wrong? Thanks.

Nested ifelse calls are a bad idea because they tend to be unreadable, which makes bugs more likely. Use case_when instead.

suppressPackageStartupMessages(library(dplyr))

a <- c("DN974", "DN469B", "DN469W;DN469E", "DN80", "EZDH01", "DN971")
df <- data.frame(a)

df <- df %>%
  mutate(a = case_when(
    a %in% c("DN974","DN469B") ~ "a",
    a %in% c("DN469W;DN469E", "DN80") ~ "b",
    TRUE ~ "c"
  ))

df
#>   a
#> 1 a
#> 2 a
#> 3 b
#> 4 b
#> 5 c
#> 6 c

Created on 2022-04-25 by the reprex package (v2.0.1)
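
As for what went wrong in the original attempt: the inner if_else( is closed right after its condition, so it receives only one argument, and the parser then meets a comma where it expects the outer parenthesis to close. A minimal sketch of the nested version with the parenthesis moved, assuming dplyr is installed (df_fixed is just an illustrative name, not something from the question):

```r
library(dplyr)

a <- c("DN974", "DN469B", "DN469W;DN469E", "DN80", "EZDH01", "DN971")
df <- data.frame(a)

# The inner if_else() now gets all three arguments:
# condition, value when TRUE, value when FALSE.
df_fixed <- mutate(
  df,
  a = if_else(a == "DN974" | a == "DN469B", "a",
              if_else(a == "DN469W;DN469E" | a == "DN80", "b", "c"))
)
```

This should give the same column as the case_when version, but each extra category adds another level of nesting, which is why case_when is usually easier to read and extend.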
