I am referencing the already answered question that has gotten me as close as possible: match / find rows based on multiple required values in a single row in R
Sample dataframe:
test <- data.frame(grp=c(1,1,2,2,2,3,3,3,4,4,4,4,4),val=c("C","I","E","I","C","E","I","A","C","I","E","E","A"))
I modified an answer to return only the grp values that match all criteria.
library(dplyr)
test %>%
  group_by(grp) %>%
  summarise(matching = all(c("A", "I", "C") %in% val)) %>%
  filter(matching == TRUE)
From here, I just need to return the number of grps that match the criteria, as a single numerical value that can be pasted into a single cell of a separate dataframe. I am trying to find matches for multiple different sets of criteria over the same data.frame (e.g. the number of groups that match the criteria A, I and C; the number of groups that match the criteria E, A and I; the number of groups that match the criteria A, I and E; etc.).
In the example, it returns a tibble:
A tibble: 1 x 2
grp matching
<dbl> <lgl>
1 4 TRUE
So there is one "grp" that matches the given criteria. I need to return that number: 1.
If my criterion is only the letter I, then I would want the code to return the number 4, as all groups (1, 2, 3, and 4) contain the letter I.
If my criterion is the letter A, then I would want the code to return the number 2, since only groups 3 and 4 contain the letter A.
If we are looking for different combinations of 'val' to filter, use combn to return the combinations of 'val' taken m = 3 at a time. For each combination, group by 'grp', filter the rows of 'test' where all of the combination's elements are present in 'val', summarise by pasting the sorted unique values of 'val', and bind the resulting list into a single data.frame with bind_rows.
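library(dplyr)
# unique() works whether 'val' is a character column (the default since R 4.0)
# or a factor; levels() would return NULL for a character column
vals <- sort(unique(as.character(test$val)))
combn(vals, 3, simplify = FALSE,
      FUN = function(x)
        test %>%
          group_by(grp) %>%
          filter(all(x %in% val)) %>%
          summarise(out = toString(sort(unique(val))))) %>%
  bind_rows()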
If we just want to get a single row as TRUE, after filtering the 'grp' based on the condition, summarise by creating 'matching' as TRUE
test %>%
  group_by(grp) %>%
  filter(all(c("A", "I", "C") %in% val)) %>%
  summarise(matching = TRUE)
# A tibble: 1 x 2
# grp matching
# <dbl> <lgl>
#1 4 TRUE
Or switch the summarise and filter steps, then pull the logical column and sum it to get the count
test %>%
  group_by(grp) %>%
  summarise(matching = all(c("A", "I", "C") %in% val)) %>%
  filter(matching) %>%
  pull(matching) %>%
  sum
#[1] 1
Or it can be made more compact: sum counts only the TRUE values, so the filter step can be dropped
test %>%
  group_by(grp) %>%
  summarise(matching = all(c("A", "I", "C") %in% val)) %>%
  pull(matching) %>%
  sum
#[1] 1
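Or using base R, restricting the contingency table to the criteria columns (without the column subset, the expression counts groups that contain every value of 'val', not just the criteria)
sum(!rowSums(table(test)[, c("A", "I", "C"), drop = FALSE] == 0))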
#[1] 1
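Since the question asks for one count per criteria set, to be placed in a separate data frame, the idea above can be wrapped in a small function. This is only a sketch in base R: count_groups, criteria_sets and results are made-up names for illustration, and it assumes the columns are called grp and val as in the sample data.

```r
# Sample data from the question
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not from any package): number of groups whose
# 'val' values contain every element of 'criteria'
count_groups <- function(data, criteria) {
  # One TRUE/FALSE per group: does the group contain all criteria?
  has_all <- tapply(data$val, data$grp, function(v) all(criteria %in% v))
  sum(has_all)
}

# Illustrative criteria sets; the list names are only labels
criteria_sets <- list(
  AIC = c("A", "I", "C"),
  EAI = c("E", "A", "I"),
  I   = "I",
  A   = "A"
)

# One row per criteria set, one count per row
results <- data.frame(
  criteria = names(criteria_sets),
  n_groups = vapply(criteria_sets, count_groups, integer(1), data = test),
  row.names = NULL
)
results
```

Each n_groups entry is a plain integer, so a single value such as count_groups(test, c("A", "I", "C")) can be assigned directly to a cell of another data frame.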
First you filter the data with your criteria, then you check which groups contain all the letters you want. It may not be the best way to do it, but it works
criteria <- c('A', 'I', 'C')
# Keep only the rows whose 'val' is one of the criteria
# (named 'matched' to avoid masking the built-in return())
matched <- subset(test, val %in% criteria)
count <- 0
for (group in unique(matched$grp))
{
  # Number of criteria present in this group
  criteriaSum <- sum(criteria %in% unique(matched$val[matched$grp == group]))
  if (criteriaSum == length(criteria))
    count <- count + 1
}
count
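The same filter-then-check idea can also be written without an explicit loop. The following is a rough sketch rather than a drop-in replacement: it assumes criteria contains no duplicates, and the variable names are illustrative.

```r
# Sample data from the question
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A"),
  stringsAsFactors = FALSE
)

criteria <- c("A", "I", "C")  # assumed to contain no duplicates

# Keep rows matching any criterion, then drop repeated (grp, val) pairs
matched <- unique(test[test$val %in% criteria, ])

# A group matches when it has as many distinct matching values as criteria
count <- sum(table(matched$grp) == length(criteria))
count
```

After unique(), every remaining row for a group is a distinct criterion it contains, so a per-group row count equal to length(criteria) means that all criteria are present.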