
How to delete rows based on a condition on factor variable in R

I have a dataset with several variables, including a factor variable that holds the segment name (Seg1, Seg2, Seg3, etc.).

The segment names are unique and there are 11 segments in total. I would like to drop the records belonging to a few of them.

Since this is a factor variable, how do I drop the records that meet the criterion? The levels I want to drop are 1, 2, 5, 6 and 7. Please help me drop the rows whose segment is one of these levels.

# Negate the match so the listed levels are dropped rather than kept
df[!(as.numeric(df$var) %in% c(1, 2, 5, 6, 7)), ]
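The integer codes of a factor follow the order of `levels(df$var)`, which by default is sorted as text, so it is worth checking which labels the codes 1, 2, 5, 6, 7 actually refer to before dropping anything. Below is a minimal sketch on made-up data; the data frame `df`, its columns `var` and `value`, and the segment names are assumptions for illustration, not the asker's real data. Dropping (as opposed to keeping) the matching rows requires negating the `%in%` test.

```r
# Hypothetical data: 11 segments named Seg1..Seg11, two rows each.
df <- data.frame(
  var   = factor(rep(paste0("Seg", 1:11), each = 2)),
  value = 1:22
)

# Inspect the level order: the integer codes index into this vector.
# With default text sorting, "Seg10" and "Seg11" come before "Seg2".
print(levels(df$var))

# Level codes to drop, and the labels they correspond to.
drop_codes     <- c(1, 2, 5, 6, 7)
dropped_labels <- levels(df$var)[drop_codes]
print(dropped_labels)

# Keep only the rows whose level code is NOT in drop_codes.
kept <- df[!(as.integer(df$var) %in% drop_codes), ]

# Remove the now-unused levels from the factor.
kept$var <- droplevels(kept$var)
```

If the labels printed for `dropped_labels` are not the segments you meant, filter by label instead of by code.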

Since you didn't provide data, but described it clearly enough to follow, I will give a general answer.

# Load dplyr for %>%, mutate() and filter()
library(dplyr)

# Generate a factor variable
# to provide an example.
mtcars <- mtcars %>% 
        mutate(
                cyl_factor = as.factor(
                        paste(cyl, "cyl")
                )
        )

I would convert the factor to character in this step, to avoid dealing with unused factor levels afterwards, and then filter:

# Convert the factor to character to
# avoid confusion over factor levels,
# then filter out the unwanted values.
mtcars <- mtcars %>% 
        mutate(
                cyl_factor = as.character(
                        cyl_factor
                )
        ) %>% filter(
                !(cyl_factor %in% c("6 cyl", "4 cyl"))
        )

Here c("6 cyl", "4 cyl") lists the factor levels that you want to exclude from your data.
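If the column should stay a factor, one alternative is to filter on the labels directly and then call `droplevels()` to discard the levels that no longer occur. This is a sketch in base R, not part of the answer above; the names `cars`, `exclude` and `cars_kept` are made up for the example.

```r
# Work on a copy of the built-in mtcars data.
cars <- mtcars
cars$cyl_factor <- factor(paste(cars$cyl, "cyl"))

# Labels of the levels to exclude.
exclude <- c("6 cyl", "4 cyl")

# %in% compares a factor by its labels, so no conversion is needed.
cars_kept <- cars[!(cars$cyl_factor %in% exclude), ]

# The excluded levels are still registered on the factor (with zero rows)...
print(table(cars_kept$cyl_factor))

# ...until droplevels() removes unused levels from all factor columns.
cars_kept <- droplevels(cars_kept)
print(levels(cars_kept$cyl_factor))
```

Keeping the factor type matters mostly when later steps (plots, models, `table()`) rely on the levels; otherwise the character conversion shown above works just as well.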

NOTE: For future posts on Stack Overflow, please see How to make a great R reproducible example and https://stackoverflow.com/help/how-to-ask.
