How to remove all rows of one group if two specified values are present in one column of a data.frame in R

I have a data frame in R from which I want to remove all rows of a particular group if two or more specific groups are present. In the example below, I want to remove all rows for 'bilberry' if both bilberry and blackberry are present. I have got as far as detecting whether my data contains two or more kinds of berries, but I am not sure about the next steps. I would prefer a solution with dplyr.

library(stringr)
library(dplyr)

data(fruit)

my.df <- data.frame(
"Name" = rep(fruit[1:7], each = 2), 
"Value" = 1:14
)

UniqueFruits <- unique(my.df$Name)
# TRUE if more than one distinct fruit name contains "berry"
sum(grepl("berry", UniqueFruits)) > 1

Maybe you are looking for:

library(dplyr)

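# All names containing "berry" (duplicates included; n_distinct() counts the distinct ones)
berries <- grep('berry', my.df$Name, value = TRUE)
# Drop the bilberry rows only when more than one kind of berry is present
if (n_distinct(berries) > 1) my.df <- my.df %>% filter(Name != 'bilberry')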

my.df

#          Name Value
#1        apple     1
#2        apple     2
#3      apricot     3
#4      apricot     4
#5      avocado     5
#6      avocado     6
#7       banana     7
#8       banana     8
#9  bell pepper     9
#10 bell pepper    10
#11  blackberry    13
#12  blackberry    14
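
To check that the condition really matters, the same logic can be run on a data frame that contains only one kind of berry. This is a minimal sketch, assuming a hypothetical helper `drop_bilberry()` (not part of the answer above) and small made-up data frames in place of `my.df`:

```r
library(dplyr)

# Hypothetical helper (not from the original answer): drops the "bilberry"
# rows only when more than one distinct berry name is present in Name.
drop_bilberry <- function(df) {
  berries <- grep("berry", df$Name, value = TRUE)
  if (n_distinct(berries) > 1) {
    df <- df %>% filter(Name != "bilberry")
  }
  df
}

# Both bilberry and blackberry present: the bilberry row is removed.
both <- data.frame(Name = c("apple", "bilberry", "blackberry"), Value = 1:3)
drop_bilberry(both)

# Only bilberry present: the data frame is returned unchanged.
one <- data.frame(Name = c("apple", "bilberry"), Value = 1:2)
drop_bilberry(one)
```

Wrapping the check in a function keeps the `if` outside the pipe, so the input is only reassigned when the condition holds.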

Here is a way that is "pure" dplyr:

library(stringr)
library(dplyr)
data(fruit)
my.df <- data.frame(
"Name" = rep(fruit[1:7], each = 2), 
"Value" = 1:14
)

my.df %>%
  # Flag bilberry rows for removal only when more than one kind of berry is present
  mutate(keepMe = case_when(
    n_distinct(Name[grepl("berry", Name)]) > 1 & Name == "bilberry" ~ FALSE,
    TRUE ~ TRUE
  )) %>%
  filter(keepMe) %>%
  select(-keepMe)

It has no if statements as such. I am not sure I really like it, but it is what you asked for: a tidyverse solution.
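
If the `keepMe` helper column feels heavy, the condition can probably be placed directly inside `filter()`. The following is a sketch under the assumption that the data frame is ungrouped, so that `n_distinct()` sees the whole `Name` column; the small data frame is made up for illustration:

```r
library(dplyr)

# Made-up example data with two kinds of berry
my.df <- data.frame(
  Name = rep(c("apple", "bilberry", "blackberry"), each = 2),
  Value = 1:6
)

# Keep every row except bilberry rows, and drop those only when
# more than one distinct berry name is present in the column.
result <- my.df %>%
  filter(!(Name == "bilberry" & n_distinct(Name[grepl("berry", Name)]) > 1))

result
```

The count of distinct berry names is a single value, so it is recycled against the row-wise test `Name == "bilberry"`; when only one kind of berry exists the whole condition is FALSE and no rows are dropped.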

Like this?

my.df %>%
  # Note: this drops every row whose Name contains "berry", not only bilberry
  mutate(berry = grepl("berry", Name)) %>%
  filter(!berry)
