I have a data frame in R from which I want to remove all rows of one particular group whenever two or more specific groups are present. In the example below, I want to remove every 'bilberry' row if both bilberry and blackberry are present. I can already detect whether my data contains two or more kinds of berry, but I am not sure about the next step. I would prefer a dplyr solution.
library(stringr)
library(dplyr)
data(fruit)
my.df <- data.frame(
  "Name" = rep(fruit[1:7], each = 2),
  "Value" = 1:14
)
UniqueFruits <- unique(my.df$Name)
sum(grepl("berry", UniqueFruits)) > 1
Maybe you are looking for this:
library(dplyr)
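# Collect every Name value that contains "berry" (duplicates included).
berries <- grep('berry', my.df$Name, value = TRUE)
# Drop the bilberry rows only when more than one distinct berry is present.
if (n_distinct(berries) > 1) my.df <- my.df %>% filter(Name != 'bilberry')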
my.df
#          Name Value
#1        apple     1
#2        apple     2
#3      apricot     3
#4      apricot     4
#5      avocado     5
#6      avocado     6
#7       banana     7
#8       banana     8
#9  bell pepper     9
#10 bell pepper    10
#11  blackberry    13
#12  blackberry    14
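If you would rather name the two groups explicitly than match on "berry", a single filter() call without an if may be enough. This is only a sketch: the seven fruit names are typed out so it does not need stringr, and both_present and result are names made up for illustration.

```r
library(dplyr)

# Same shape as the question's data, with the fruit names written out
# so the snippet does not depend on stringr::fruit.
my.df <- data.frame(
  Name  = rep(c("apple", "apricot", "avocado", "banana",
                "bell pepper", "bilberry", "blackberry"), each = 2),
  Value = 1:14
)

# TRUE only when both groups occur somewhere in the Name column.
both_present <- all(c("bilberry", "blackberry") %in% my.df$Name)

# Drop bilberry rows only in that case; otherwise every row is kept.
result <- my.df %>%
  filter(!(both_present & Name == "bilberry"))
```

Because the condition is part of the filter expression, the same pipeline leaves the data untouched when either group is missing.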
Here is a way that is "pure" dplyr:
library(stringr)
library(dplyr)
data(fruit)
my.df <- data.frame(
"Name" = rep(fruit[1:7], each = 2),
"Value" = 1:14
)
my.df %>%
  # Flag bilberry rows for removal only when more than one distinct berry name occurs.
  mutate(keepMe = case_when(
    n_distinct(Name[grepl("berry", Name)]) > 1 & Name == "bilberry" ~ FALSE,
    TRUE ~ TRUE
  )) %>%
  filter(keepMe)
It has no if statements as such. I am not sure I really like it, but it is what you asked for: a tidyverse solution.
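To check that an approach like this really is conditional, it may help to run it on data with and without a second berry. The sketch below wraps the idea in a hypothetical helper (drop_bilberry_if_several_berries is my own name, not a dplyr function) and types out the fruit names instead of loading stringr.

```r
library(dplyr)

# Hypothetical helper, not part of dplyr: drops bilberry rows only when
# the Name column holds more than one distinct berry.
drop_bilberry_if_several_berries <- function(df) {
  several <- n_distinct(df$Name[grepl("berry", df$Name)]) > 1
  df %>% filter(!(several & Name == "bilberry"))
}

fruits <- c("apple", "apricot", "avocado", "banana",
            "bell pepper", "bilberry", "blackberry")
with_both <- data.frame(Name = rep(fruits, each = 2), Value = 1:14)
# Same data without the blackberry rows, so bilberry is the only berry left.
only_bilberry <- with_both %>% filter(Name != "blackberry")

res_both <- drop_bilberry_if_several_berries(with_both)      # bilberry rows removed
res_only <- drop_bilberry_if_several_berries(only_bilberry)  # no rows removed
```

In the first call the bilberry rows go away; in the second they stay, since bilberry is then the only berry in the data.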
Like this?
# Note: this removes every row whose Name contains "berry", blackberry included.
my.df %>%
  mutate(berry = grepl("berry", Name)) %>%
  filter(!berry)