R: using dplyr to remove certain rows in the data.frame

Question

dat <- data.frame(ID = c(1, 2, 2, 2), Gender = c("Both", "Both", "Male", "Female"))
> dat
  ID Gender
1  1   Both
2  2   Both
3  2   Male
4  2 Female

For each ID, if Gender takes all three values (Both, Male, and Female), I want to remove the row with Both. That is, my desired data is this:

  ID Gender
1  1   Both
2  2   Male
3  2 Female

I've tried to do this by using the code below:

library(dplyr)
> dat %>% 
  group_by(ID) %>% 
  mutate(A = ifelse(length(unique(Gender)) >= 3 & Gender == 'Both', F, T)) %>% 
  filter(A) %>% 
  select(-A)

# A tibble: 2 x 2
# Groups:   ID [1]
     ID Gender
  <dbl> <fctr>
1     2   Male
2     2 Female

I'm declaring a dummy variable called A, where A = F if, for a given ID, all 3 values of Gender are present ("Both", "Male", and "Female"; these are the only values Gender can take) and the corresponding row has Gender == "Both". I then remove the rows where A is F.

However, it seems that A = F is being assigned to the first row, even though ID 1 has only "Both" and not all of "Both", "Male", and "Female". Why?

Answer 1

After grouping by 'ID', filter on a logical condition: keep the rows where 'Gender' is not 'Both' and the number of distinct values of 'Gender' in the group is 3, i.e. 'Male', 'Female', and 'Both' (the OP mentioned that no other values are possible), or ( | ) the rows of any group that has only one row:

dat %>% 
  group_by(ID) %>% 
  filter((Gender != "Both" & n_distinct(Gender) == 3) | n() == 1)
# A tibble: 3 x 2
# Groups:   ID [2]
#    ID Gender
#  <dbl> <fct> 
#1     1 Both  
#2     2 Male  
#3     2 Female

Another option is to keep only the 'Male' and 'Female' rows, plus any group that has a single row:

dat %>% 
  group_by(ID) %>% 
  filter(Gender %in% c("Male", "Female") | n() == 1)
# A tibble: 3 x 2
# Groups:   ID [2]
#     ID Gender
#  <dbl> <fct> 
#1     1 Both  
#2     2 Male  
#3     2 Female
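
Both filters above return the desired rows for the sample data, but they treat an ID that has only two of the values (say "Both" and "Male") differently from the rule stated in the question: the first filter drops that whole group, and the second drops its "Both" row. A minimal sketch that follows the stated rule literally, removing "Both" only when all three values occur for an ID. The extra ID 3 is a hypothetical case added for illustration and is not part of the original data:

```r
library(dplyr)

# Sample data from the question, plus a hypothetical ID 3 that has only
# "Both" and "Male" (an assumption for illustration, not in the original).
dat2 <- data.frame(
  ID = c(1, 2, 2, 2, 3, 3),
  Gender = c("Both", "Both", "Male", "Female", "Both", "Male")
)

# Within each ID, drop a row only if it is "Both" AND the group contains
# all three distinct values; every other row is kept.
res <- dat2 %>%
  group_by(ID) %>%
  filter(!(Gender == "Both" & n_distinct(Gender) == 3)) %>%
  ungroup()

print(res)
```

With this data, the only row removed is the "Both" row of ID 2; ID 1 and the hypothetical ID 3 are left unchanged.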

Answer 2

In base R, using ave:

dat[!(ave(dat$Gender, dat$ID, FUN = function(x) length(unique(x))) != '1' & (dat$Gender == 'Both')), ]
  ID Gender
1  1   Both
3  2   Male
4  2 Female
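
The one-liner above compares the result of ave against the string '1', which relies on Gender being a character column, because ave writes its results back into a copy of its first argument. It also removes "Both" from any ID with more than one distinct value, not only from IDs with all three. A hedged variant, assuming Gender may be a factor (stringsAsFactors = TRUE is set here only to imitate that case) and that "Both" should be removed only when all three values are present:

```r
dat <- data.frame(
  ID = c(1, 2, 2, 2),
  Gender = c("Both", "Both", "Male", "Female"),
  stringsAsFactors = TRUE  # assumption: imitate a factor Gender column
)

# Count the distinct Gender values per ID. ave() is given integer codes,
# so its result is numeric whether Gender is a factor or a character vector.
n_unique <- ave(as.integer(factor(dat$Gender)), dat$ID,
                FUN = function(x) length(unique(x)))

# Remove "Both" only for IDs where all three values occur.
res_base <- dat[!(n_unique == 3 & dat$Gender == "Both"), ]

print(res_base)
```

The intermediate n_unique vector is kept separate so the per-ID counts can be inspected before subsetting.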

2 answers

solution1
2 ACCEPTED 2018-06-10 23:29:14

solution2
1 2018-06-11 00:39:06
