
Return only the count of rows that match multiple criteria at once in R

I am referencing the already answered question that has gotten me as close as possible: match / find rows based on multiple required values in a single row in R

Sample dataframe:

test <- data.frame(grp=c(1,1,2,2,2,3,3,3,4,4,4,4,4),val=c("C","I","E","I","C","E","I","A","C","I","E","E","A"))

I modified an answer to return only the grp values that match all criteria.

library('dplyr')
test %>%
  group_by(grp) %>%
  summarise(matching = all(c("A", "I", "C") %in% val)) %>%
  filter(matching == TRUE)

From here, I just need to return the number of grps that match the criteria, as a single numerical value that can be pasted into a single cell of a separate dataframe. I am trying to find matches for multiple different sets of criteria over the same data.frame (e.g. the number of groups that match the criteria A, I and C; the number of groups that match the criteria E, A and I; the number of groups that match the criteria A, I and E; etc.).

In the example, it returns a tibble:

A tibble: 1 x 2
    grp matching
  <dbl> <lgl>   
1     4 TRUE

So there is one "grp" that matches the given criteria. I need to return that number: 1.

If my criteria is only the letter I, then I would want the code to return the number 4, as all groups (1, 2, 3, and 4) contain the letter I.

If my criteria is the letter A, then I would want the code to return the number 2, since only groups 3 and 4 contain the letter A.

If we are looking for different combinations of 'val' to filter on, use combn to return the combinations of the 'val' values taken m = 3 at a time. For each combination, group by 'grp', filter the rows of 'test' where all elements of the combination are present in 'val', summarise by pasting the sorted unique values of 'val', and bind the resulting list into a single data.frame with bind_rows.

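library(dplyr)
# levels(test$val) is NULL when 'val' is character (the data.frame() default
# since R 4.0), so take the distinct values directly
combn(sort(unique(as.character(test$val))), 3, simplify = FALSE,
     FUN = function(x)
      test %>%
         group_by(grp) %>%
         filter(all(x  %in% val)) %>% 
         summarise(out = toString(sort(unique(val))))) %>% 
  bind_rows()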
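To get one count per combination rather than the matching groups themselves, the same combn idea can be reduced to a single number for each combination. This is a minimal sketch, assuming the sample `test` data from the question; the names `combos` and `counts` are illustrative and not part of the original answer.

```r
library(dplyr)

# sample data from the question
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

# every combination of 3 distinct 'val' values, as a list of character vectors
combos <- combn(sort(unique(as.character(test$val))), 3, simplify = FALSE)

# for each combination, count the groups that contain all of its values
counts <- sapply(combos, function(x) {
  test %>%
    group_by(grp) %>%
    filter(all(x %in% val)) %>%
    pull(grp) %>%
    n_distinct()
})

# one row per combination, with the number of matching groups
data.frame(criteria = sapply(combos, toString), n_groups = counts)
```

With the sample data there are four combinations (A, C, E; A, C, I; A, E, I; C, E, I), matched by 1, 1, 2 and 2 groups respectively.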

Update

If we just want to get a single row as TRUE, then after filtering the 'grp' based on the condition, summarise by creating the 'matching' column as TRUE

test %>%
     group_by(grp) %>%
     filter(all(c("A", "I", "C") %in% val)) %>%
     summarise(matching = TRUE)
# A tibble: 1 x 2
#    grp matching
#  <dbl> <lgl>   
#1     4 TRUE  

Or switch the summarise and filter steps, then pull the 'matching' column and sum it to get the count

test %>% 
   group_by(grp) %>% 
   summarise(matching = all(c("A", "I", "C") %in% val)) %>% 
   filter(matching)  %>%
   pull(matching) %>%
   sum 
#[1] 1

Or this can be made more compact: summing a logical vector counts its TRUE values, so the filter step is not needed

test %>% 
    group_by(grp) %>%
    summarise(matching = all(c("A", "I", "C") %in% val)) %>% 
    pull(matching) %>% 
    sum
 #[1] 1
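Since the goal is to fill cells of a separate data frame for several criteria sets, the compact pipeline can be wrapped in a function. This is a sketch only: `count_groups` and `results` are hypothetical names, and it assumes the data has no column called `criteria` that would mask the function argument.

```r
library(dplyr)

# sample data from the question
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

# hypothetical helper: number of groups whose 'val' contains every criterion
count_groups <- function(data, criteria) {
  data %>%
    group_by(grp) %>%
    summarise(matching = all(criteria %in% val)) %>%
    pull(matching) %>%
    sum()
}

# illustrative results table with one row per criteria set
results <- data.frame(set = c("A,I,C", "I", "A"), n_groups = NA_integer_)
results$n_groups[1] <- count_groups(test, c("A", "I", "C"))
results$n_groups[2] <- count_groups(test, "I")
results$n_groups[3] <- count_groups(test, "A")
results
```

For the sample data this gives 1, 4 and 2, which are the values the question expects for those three criteria sets.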

Or using base R

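# restrict the grp-by-val table to the criteria columns;
# a group matches when none of its counts is 0
sum(!rowSums(table(test)[, c("A", "I", "C"), drop = FALSE] == 0))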
#[1] 1

First, filter the data down to the rows that match your criteria, then check, for each group, whether it contains every letter you want. It may not be the best way to do it, but it works.

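criteria <- c('A','I','C')
# keep only the rows whose val is one of the criteria
# (named 'matched' to avoid shadowing the built-in return())
matched <- subset(test, val %in% criteria)
count <- 0

for(group in unique(matched$grp))
{
  # number of criteria present in this group
  criteriaSum <- sum(criteria %in% unique(matched$val[matched$grp == group]))
  if(criteriaSum == length(criteria))
    count <- count + 1
}
count
#[1] 1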
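The same per-group check can be written without an explicit loop. The sketch below swaps in base R's tapply, which is not what the answer above uses; `has_all` is an illustrative name.

```r
# sample data from the question
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

criteria <- c("A", "I", "C")

# one TRUE/FALSE per group: does the group contain every criterion?
has_all <- tapply(test$val, test$grp, function(v) all(criteria %in% v))

# summing the logical vector counts the matching groups
count <- sum(has_all)
count
```

Here only group 4 contains A, I and C, so `count` is 1.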
