I'm having trouble creating a new variable from selected levels of another variable. The data set is gss and the variable is class, which has 5 levels ("Lower Class", "Working Class", "Middle Class", "Upper Class", "No Class") plus NA.
If I run,
gss %>%
  select(class) %>%
  str()
It gives me
'data.frame': 57061 obs. of 1 variable:
$ class: Factor w/ 5 levels "Lower Class",..: 3 3 2 3 2 3 3 2 2 2 ...
Since I am only interested in those who specified their economic class, I would like to take out the "No Class" level and NA. I do not know a better way to do this, so I did:
gss <- gss %>%
  mutate(filteredclass = ifelse(class == "Lower Class", "Lower Class",
                         ifelse(class == "Working Class", "Working Class",
                         ifelse(class == "Middle Class", "Middle Class",
                         ifelse(class == "Upper Class", "Upper Class", NA)))))
Then, I tried to see whether it worked or not, so I ran:
with (gss, table(filteredclass))
That gave me the levels in a mixed-up order:
filteredclass
  Lower Class  Middle Class   Upper Class Working Class
         3147         24289          1741         24458
I would like the new variable filteredclass to be shown in the same order as the variable class. If I do the same with class, it gives me:
with (gss, table(class))
class
  Lower Class Working Class  Middle Class   Upper Class
         3147         24458         24289          1741
     No Class
            1
Is there any way I can fix this? Also, is there any way I can take out the "No Class" level without going through the mutate command I used above?
Thanks for your help in advance!
In the future, it's much easier if you provide a reproducible example.
If you want to get rid of "No Class", you can use filter():
gss <- gss %>%
  filter(class != "No Class") %>%
  droplevels()
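To remove NAs, use na.omit(). Note that it drops every row with an NA in any column, and that the filter() call above already drops the rows where class is NA: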
gss <- na.omit(gss)
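The difference between dropping NAs everywhere and dropping them for class only can be sketched on a small made-up data frame. This is a hedged illustration in base R, not the real gss data; the age column is a hypothetical second variable added only to show the effect.

```r
# Toy data: `age` is a hypothetical second column, not from the question
toy <- data.frame(
  class = factor(c("Lower Class", "No Class", NA, "Upper Class")),
  age   = c(30, 41, 25, NA)
)

# na.omit() drops a row if ANY column is NA (rows 3 and 4 here)
all_cols <- na.omit(toy)

# Restricting the check to `class` keeps the row whose only NA is in `age`
class_only <- toy[!is.na(toy$class), ]

nrow(all_cols)
nrow(class_only)
```

If other columns in gss contain missing values that you want to keep, the column-specific version is probably the safer choice.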
The easiest way could be to call factor() on class:
gss$filteredclass <- factor(gss$class, c("Lower Class", "Working Class",
"Middle Class", "Upper Class"))
This drops the "No Class" level and sets those values to NA.
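As a minimal sketch of this approach on made-up data (the toy data frame below is an assumption, not the real gss data), passing only the wanted levels to factor() keeps them in the given order and turns every other value into NA:

```r
# Toy stand-in for gss; the values are invented for illustration
toy <- data.frame(class = factor(
  c("Middle Class", "No Class", "Lower Class", NA, "Upper Class", "Working Class"),
  levels = c("Lower Class", "Working Class", "Middle Class", "Upper Class", "No Class")
))

wanted <- c("Lower Class", "Working Class", "Middle Class", "Upper Class")

# Values not listed in `levels` ("No Class") become NA; the order of `wanted` is kept
toy$filteredclass <- factor(toy$class, levels = wanted)

levels(toy$filteredclass)
table(toy$filteredclass, useNA = "ifany")
```

Because the level order comes from the vector you pass in, table() should then print the columns in that order rather than alphabetically.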
You have to relevel the factor with the same order as gss$class. To do this, you can add another line to your mutate() statement where you create the factor with the same levels and drop unused levels (No Class).
library(tidyverse)
# Generate the data you showed
gss <- data.frame(class = factor(sample(c("Lower Class", "Working Class", "Middle Class", "Upper Class", NA, "No Class"),
                                        45000, replace = TRUE))) %>%
  mutate(class = factor(class, levels = c("Lower Class", "Working Class", "Middle Class", "Upper Class", "No Class", NA)))
# Sampled data
with(gss, table(class, useNA = "always"))
# Mutate gss the way you did it
gss <- gss %>%
  mutate(filteredclass = ifelse(class == "Lower Class", "Lower Class",
                         ifelse(class == "Working Class", "Working Class",
                         ifelse(class == "Middle Class", "Middle Class",
                         ifelse(class == "Upper Class", "Upper Class", NA)))),
         # Then make filteredclass into a factor with the same levels as class
         # Use droplevels() to remove unused classes (since we removed the No Class)
         filteredclass = droplevels(factor(filteredclass, levels = levels(class))))
with(gss, table(class))
with(gss, table(filteredclass))
The output is:
> with(gss, table(class, useNA = "always"))
class
  Lower Class Working Class  Middle Class   Upper Class      No Class
         7362          7469          7626          7450          7457
         <NA>
         7636
> with(gss, table(class))
class
  Lower Class Working Class  Middle Class   Upper Class      No Class
         7362          7469          7626          7450          7457
> with(gss, table(filteredclass))
filteredclass
  Lower Class Working Class  Middle Class   Upper Class
         7362          7469          7626          7450
A much quicker way is to use droplevels() instead of the chain of ifelse() statements:
# Drop the No Class level: those values become NA, and no rows are removed
with(gss %>% mutate(filteredclass = droplevels(class, exclude = c(NA, "No Class"))),
     table(filteredclass))
filteredclass
  Lower Class Working Class  Middle Class   Upper Class
         7362          7469          7626          7450