How to use mutate and ifelse to convert a numerical variable into a factor variable with multiple levels

I'm having difficulty creating a new factor variable from a preexisting numerical variable. I have a numerical variable Age with the age of my participants but want to create a factor variable that categorizes participants' age into different categories. Whenever I run my code I get an error:

"Error: argument "no" is missing, with no default."

I have tried several variations of the code below, such as writing the new factor levels without quotes or using : for ranges. My code is below.

data.frame%>%
    mutate(Age = ifelse(Age < 20, "0"),
           ifelse(Age >= 20 & Age <= 29, "1"),
                  ifelse(Age >=30 & Age <= 39, "2"),
                        ifelse(Age >= 40 & Age <=49, "3"),
                               ifelse(Age >= 50 & Age <= 59, "4"),
                                     ifelse(Age >= 60 & Age <= 69, "5"),
                                           ifelse(Age >= 70, "6", NA))

cut() is the easiest way to do this.

In base R:

Age <- seq(10,80,by=10)
cut(Age,breaks=c(-Inf,seq(20,70,by=10),Inf),
        right=FALSE,
        labels=as.character(0:6))

I'll leave you to embed this in mutate() as you like.
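
One way to embed this in mutate() is sketched below. It assumes dplyr is installed; df_ages and Age_cat are illustrative names that do not come from the question.

```r
library(dplyr)

# Illustrative data; df_ages and Age_cat are made-up names
df_ages <- data.frame(Age = c(5, 19, 20, 29, 30, 45, 59, 60, 70, 85, NA))

df_ages <- df_ages %>%
  mutate(Age_cat = cut(Age,
                       breaks = c(-Inf, seq(20, 70, by = 10), Inf),
                       right  = FALSE,              # intervals are [lower, upper)
                       labels = as.character(0:6)))

# cut() already returns a factor, and missing ages stay NA
table(df_ages$Age_cat, useNA = "ifany")
```

Because cut() returns a factor directly, no extra factor() call is needed here.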

The problem with your code is that the ifelse() calls are not nested properly: each one is closed right after its yes value, so it never receives a no argument. Compare this snippet carefully to your code:

Age = ifelse(Age < 20, "0",
         ifelse(Age >= 20 & Age <= 29, "1",
            ifelse(...,[yes],[no])))
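
To see where the error comes from, here is a small base R sketch. The vector values and the "2+" label are illustrative only. A call without a no argument should fail with the same kind of message as in the question as soon as any test is FALSE, while the nested form works.

```r
Age <- c(15, 25, 35)

# Without a `no` argument, ifelse() fails once any element of the test is FALSE
msg <- tryCatch(ifelse(Age < 20, "0"),
                error = function(e) conditionMessage(e))
print(msg)

# Nested correctly: the inner ifelse() is the `no` argument of the outer one
nested <- ifelse(Age < 20, "0",
                 ifelse(Age <= 29, "1", "2+"))
print(nested)
```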

The closing parentheses ")" should all go at the very end, after the innermost ifelse, so that each ifelse is the no argument of the one before it:

library(dplyr)

# Example data: ages 1 to 80 plus one missing value
df1 <- data.frame(Age=c(1:80,NA))

df1 %>%
    mutate(Age_cat = factor(ifelse(Age < 20, "0",
           ifelse(Age >= 20 & Age <= 29, "1",
                  ifelse(Age >=30 & Age <= 39, "2",
                        ifelse(Age >= 40 & Age <=49, "3",
                               ifelse(Age >= 50 & Age <= 59, "4",
                                     ifelse(Age >= 60 & Age <= 69, "5",
                                           ifelse(Age >= 70, "6", NA)))))))))

However, you should also know that in dplyr this is the perfect opportunity for case_when:

df1 %>%
  mutate(Age_cat = factor(case_when(
    Age <  20             ~ "0",
    Age >= 20 & Age <= 29 ~ "1",
    Age >= 30 & Age <= 39 ~ "2",
    Age >= 40 & Age <= 49 ~ "3",
    Age >= 50 & Age <= 59 ~ "4",
    Age >= 60 & Age <= 69 ~ "5",
    Age >= 70             ~ "6"   # a missing Age matches no condition and stays NA
  )))

This gives (abridged):

   Age Age_cat
1    1       0
2    2       0
3    3       0
4    4       0
5    5       0
...
13  13       0
14  14       0
15  15       0
16  16       0
17  17       0
18  18       0
19  19       0
20  20       1
21  21       1
22  22       1
23  23       1
24  24       1
...
79  79       6
80  80       6
81  NA    <NA>
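
Since case_when() uses the first condition that is TRUE for each row, the lower bounds can be dropped. The sketch below assumes dplyr is installed and uses an illustrative data frame, df_demo. It also covers non-integer ages such as 29.5, which a pair of conditions like Age <= 29 and Age >= 30 would leave unmatched.

```r
library(dplyr)

# Illustrative data; df_demo is a made-up name
df_demo <- data.frame(Age = c(19, 20, 29.5, 30, 69, 70, NA))

df_demo <- df_demo %>%
  mutate(Age_cat = factor(case_when(
    Age < 20  ~ "0",
    Age < 30  ~ "1",   # reached only when Age >= 20
    Age < 40  ~ "2",
    Age < 50  ~ "3",
    Age < 60  ~ "4",
    Age < 70  ~ "5",
    Age >= 70 ~ "6"    # a missing Age matches nothing and stays NA
  ), levels = as.character(0:6)))

print(df_demo)
```

Passing levels explicitly keeps all seven categories in the factor even when some of them do not occur in the data.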


 