Return only the count of rows that match multiple criteria at once in R
I am referencing an already answered question that got me as close as possible: match / find rows based on multiple required values in a single row in R
Sample data frame:
test <- data.frame(grp=c(1,1,2,2,2,3,3,3,4,4,4,4,4),val=c("C","I","E","I","C","E","I","A","C","I","E","E","A"))
I modified an answer to return only the grp values that match all criteria.
library('dplyr')
test %>%
  group_by(grp) %>%
  summarise(matching = all(c("A", "I", "C") %in% val)) %>%
  filter(matching == TRUE)
From here, I just need to return the number of grps that match the criteria, as a single numeric value that can be pasted into a single cell of a separate data frame. I am trying to find matches for multiple different sets of criteria over the same data.frame (e.g. the number of groups that match the criteria A, I and C; the number of groups that match the criteria E, A and I; the number of groups that match the criteria A, I and E; etc.).
In the example, it returns a tibble:
A tibble: 1 x 2
grp matching
<dbl> <lgl>
1 4 TRUE
So there is one "grp" that matches the given criteria. I need to return that number: 1.
If my criterion is only the letter I, then I would want the code to return the number 4, as all groups (1, 2, 3, and 4) contain the letter I.
If my criterion is the letter A, then I would want the code to return the number 2, since only groups 3 and 4 contain the letter A.
If we are looking for different combinations of 'val' to filter on, use combn to return the combinations of 'val' taken m = 3 at a time. For each combination, group by 'grp', filter the rows of 'test' where all values of the combination are present in 'val', summarise by pasting the sorted unique values of 'val', and bind the resulting list into a single data.frame with bind_rows.
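library(dplyr)
# Since R 4.0, data.frame() no longer converts strings to factors, so
# levels(test$val) would be NULL; take the sorted unique values instead.
combn(sort(unique(as.character(test$val))), 3, simplify = FALSE,
      FUN = function(x)
        test %>%
          group_by(grp) %>%
          filter(all(x %in% val)) %>%
          summarise(out = toString(sort(unique(val))))) %>%
  bind_rows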
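Since the question asks for counts rather than the matching rows, the same combn idea can be turned into one count per combination. The following is only a sketch under the assumptions of the sample data; the names combos, combo_counts, criteria and n_groups are made up for illustration and do not come from the original answer.

```r
library(dplyr)

test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

# All 3-letter combinations of the distinct values in 'val'
combos <- combn(sort(unique(as.character(test$val))), 3, simplify = FALSE)

# For each combination, flag the groups that contain every letter in it,
# then collapse the flags into a single count
combo_counts <- bind_rows(lapply(combos, function(x) {
  test %>%
    group_by(grp) %>%
    summarise(matching = all(x %in% val)) %>%
    summarise(criteria = toString(x), n_groups = sum(matching))
}))

combo_counts
```

Each row of combo_counts then holds one combination and the number of groups that contain all of its letters.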
If we just want to get a single row with TRUE, filter the 'grp' based on the condition and then summarise by creating matching as TRUE:
test %>%
group_by(grp) %>%
filter(all(c("A", "I", "C") %in% val)) %>%
summarise(matching = TRUE)
# A tibble: 1 x 2
# grp matching
# <dbl> <lgl>
#1 4 TRUE
Or switch the summarise and filter steps, then pull the logical column and sum it to get the count:
test %>%
group_by(grp) %>%
summarise(matching = all(c("A", "I", "C") %in% val)) %>%
filter(matching) %>%
pull(matching) %>%
sum
#[1] 1
Or it can be made more compact; summing the logical column counts only the TRUE values, so the filter step is not needed:
test %>%
group_by(grp) %>%
summarise(matching = all(c("A", "I", "C") %in% val)) %>%
pull(matching) %>%
sum
#[1] 1
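To reuse this for several criteria sets over the same data frame, the pipeline can be wrapped in a small function. This is a minimal sketch, not part of the original answer: count_matching_groups, criteria_sets and results are hypothetical names, and the function assumes the columns are called grp and val as in the sample data.

```r
library(dplyr)

test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

# Hypothetical helper: number of groups whose 'val' contains every criterion
count_matching_groups <- function(data, criteria) {
  data %>%
    group_by(grp) %>%
    summarise(matching = all(criteria %in% val)) %>%
    pull(matching) %>%
    sum()
}

# Several criteria sets checked against the same data frame
criteria_sets <- list(
  AIC = c("A", "I", "C"),
  EAI = c("E", "A", "I"),
  I   = "I",
  A   = "A"
)

# One count per criteria set, stored in a separate data frame
results <- data.frame(
  criteria = names(criteria_sets),
  n_groups = vapply(criteria_sets,
                    function(crit) count_matching_groups(test, crit),
                    numeric(1))
)

results
```

Each value in results$n_groups is a single number, so it can be assigned to an individual cell of another data frame.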
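Or using base R. Restrict the contingency table to the criteria columns, otherwise the check requires every value of 'val' to be present in the group:
sum(!rowSums(table(test)[, c("A", "I", "C"), drop = FALSE] == 0))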
#[1] 1
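A base R version that takes the criteria as an argument could look like the following. It is a sketch only: count_groups_base is a hypothetical name, and it uses tapply instead of table, which is a different technique from the one-liner above.

```r
test <- data.frame(
  grp = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 4),
  val = c("C", "I", "E", "I", "C", "E", "I", "A", "C", "I", "E", "E", "A")
)

# Hypothetical helper: for each grp, check that every criterion occurs
# among that group's values, then count the groups where this holds
count_groups_base <- function(data, criteria) {
  hits <- tapply(data$val, data$grp, function(v) all(criteria %in% v))
  sum(hits)
}

count_groups_base(test, c("A", "I", "C"))  # groups containing A, I and C
count_groups_base(test, "I")               # groups containing I
count_groups_base(test, "A")               # groups containing A
```

With the sample data these calls should give 1, 4 and 2, matching the numbers stated in the question.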
First you filter the data with your criteria, then you check which groups contain all the letters you want. Maybe it is not the best way to do it, but it works.
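criteria <- c('A', 'I', 'C')
# Keep only the rows whose value is one of the criteria
# (named 'matched' to avoid masking the base function return())
matched <- subset(test, val %in% criteria)
count <- 0
for (group in unique(matched$grp)) {
  # Number of criteria found among this group's values
  criteriaSum <- sum(criteria %in% unique(matched$val[matched$grp == group]))
  if (criteriaSum == length(criteria))
    count <- count + 1
}
count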