Remove the last dummy of a character or factor variable in R

I borrowed a small example from another question:

df <- data.frame(letter = rep(c('a', 'b', 'c'), each = 2), y = 1:6)
library(caret)
dummy <- dummyVars(~ ., data = df, fullRank = TRUE, sep = "_")
head(predict(dummy, df))

##    letter_b letter_c y
##  1        0        0 1
##  2        0        0 2
##  3        1        0 3
##  4        1        0 4
##  5        0        1 5
##  6        0        1 6

However, the result drops the first dummy of the factor variable, letter_a.

I have also tried fastDummies::dummy_cols as follows:

head(fastDummies::dummy_cols(df, remove_selected_columns=TRUE, remove_first_dummy=TRUE))

##    y letter_b letter_c
##  1 1        0        0
##  2 2        0        0
##  3 3        1        0
##  4 4        1        0
##  5 5        0        1
##  6 6        0        1

but it only offers a remove_first_dummy = TRUE argument, which also removes letter_a. How can one instead remove the last dummy of the factor variable, letter_c, in a concise and convenient way in R?

You can use relevel to make the last level the reference (in this case c), so that fullRank = TRUE drops its dummy instead of the first one:

library(caret)
df <- data.frame(letter = rep(c('a', 'b', 'c'), each = 2), y = 1:6)
df$letter <- relevel(factor(df$letter),ref = "c")
dummy <- dummyVars(~ ., data = df, fullRank = TRUE, sep = "_")
head(predict(dummy,df))

  letter_a letter_b y
1        1        0 1
2        1        0 2
3        0        1 3
4        0        1 4
5        0        0 5
6        0        0 6
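
If you do not want to hard-code "c", the same idea can be generalized. The following is only a minimal sketch in base R: the helper name relevel_to_last is hypothetical (it is not part of caret or fastDummies), and it assumes the columns are character vectors or unordered factors. It moves the last level of each such column to the reference position, after which any first-level-dropping encoder will omit the last dummy. model.matrix is used here only to keep the example free of extra packages.

```r
# Hypothetical helper: make the LAST level of every character/factor column
# the reference level. Assumes unordered factors (relevel() rejects ordered ones).
relevel_to_last <- function(data) {
  for (col in names(data)) {
    x <- data[[col]]
    if (is.character(x) || is.factor(x)) {
      f <- factor(x)
      data[[col]] <- relevel(f, ref = levels(f)[nlevels(f)])
    }
  }
  data
}

df <- data.frame(letter = rep(c("a", "b", "c"), each = 2), y = 1:6)
df2 <- relevel_to_last(df)

# Treatment contrasts drop the reference level (now "c");
# the intercept column is removed to leave only the dummies.
dummies <- model.matrix(~ letter, data = df2)[, -1, drop = FALSE]
colnames(dummies)
```

The releveled data frame df2 can also be passed to dummyVars(~ ., data = df2, fullRank = TRUE, sep = "_") as in the answer above; since "c" is then the first level, its dummy should be the one that is dropped.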
