Count rows based on multiple criteria
I have a simple question, but I don't know how to solve it. I have two matrices and I'm trying to create a column in the first one that holds the number of rows in the second one that match a set of criteria. For example, imagine I have matrix A:
Ad1 Ad2 Ad3 Ad4
AA 101 0 10
AA 101 10 12
AA 101 12 15
AA 101 15 20
AA 300 0 100
AA 300 100 230
AA 300 230 300
...
and matrix B is:
Bd1 Bd2 Bd3
AA 101 0
AA 101 1
AA 101 2
AA 101 4
AA 101 5
...
AB 102 1
AB 102 10
...
and I would like to create a fifth column in A with the count of rows in B that match the following condition (for each row of A):
(A$Ad1==B$Bd1) & (A$Ad2==B$Bd2) & (A$Ad3<=B$Bd3) & (A$Ad4>B$Bd3)
Is there a way to do this without writing a loop over each row of A?
The factor nature of the first column can get in the way (read.table created factors by default before R 4.0.0), so the first comparison needs either as.character or %in%:
A = read.table(text="Ad1 Ad2 Ad3 Ad4
AA 101 0 10
AA 101 10 12
AA 101 12 15
AA 101 15 20
AA 300 0 100
AA 300 100 230
AA 300 230 300", header=TRUE)
B = read.table(text=" Bd1 Bd2 Bd3
AA 101 0
AA 101 1
AA 101 2
AA 101 4
AA 101 5
AB 102 1
AB 102 10", header=TRUE)
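> # B$Bd1 %in% x tests every row of B against x; x %in% B$Bd1 would give a single recycled TRUE/FALSE
> with( A, mapply(function(x,y,z,z2){sum((B$Bd1 %in% x) & (y == B$Bd2) &
(z <= B$Bd3) & (z2 > B$Bd3) )},
Ad1, Ad2, Ad3, Ad4) )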
[1] 5 0 0 0 0 0 0
> with( A, mapply(function(x,y,z,z2){sum((as.character(x) == B$Bd1) & (y == B$Bd2) &
(z <= B$Bd3) & (z2 > B$Bd3) )},
Ad1, Ad2, Ad3, Ad4) )
[1] 5 0 0 0 0 0 0
This is the error that gets thrown when == is used on factors with different level sets:
> factor("a", levels=c("a","b")) == factor("a")
Error in Ops.factor(factor("a", levels = c("a", "b")), factor("a")) :
level sets of factors are different
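As a minimal sketch of the workaround, assuming two illustrative factors f1 and f2 (hypothetical names, not part of the original data), comparing the labels as character strings avoids the level-set check:

```r
# f1 and f2 are illustrative factors with different level sets
f1 <- factor("a", levels = c("a", "b"))
f2 <- factor("a")

# Comparing the labels as character strings avoids the Ops.factor error
same_label <- as.character(f1) == as.character(f2)

# %in% also works here, because match() converts factors to character
in_set <- f1 %in% f2

print(same_label)
print(in_set)
```

Both comparisons work on the labels only, so they do not depend on the two factors sharing the same set of levels.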
You could use apply:
A = read.table(text="
Ad1 Ad2 Ad3 Ad4
AA 101 0 10
AA 101 10 12
AA 101 12 15
AA 101 15 20
AA 300 0 100
", header=T)
B = read.table(text="
Bd1 Bd2 Bd3
AA 101 0
AA 101 1
AA 101 2
AA 101 10
AA 101 12
", header=T)
Use apply to count, for each row in A, the number of rows in B for which your condition holds.
apply(A, 1, function(x) {
sum( (x["Ad1"] == B$Bd1) &
(as.numeric(x["Ad2"]) == B$Bd2) &
(as.numeric(x["Ad3"]) <= B$Bd3) &
(as.numeric(x["Ad4"]) > B$Bd3) )
})
[1] 3 1 1 0 0
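To get the fifth column the question asks for, the counts can be assigned back to A. The sketch below is one possible way to do that with the sample data from this answer; the column name n_match is an assumption, and it iterates over row indices with vapply rather than apply, so the numeric columns are never converted to character.

```r
A <- read.table(text = "
Ad1 Ad2 Ad3 Ad4
AA 101 0 10
AA 101 10 12
AA 101 12 15
AA 101 15 20
AA 300 0 100
", header = TRUE, stringsAsFactors = FALSE)

B <- read.table(text = "
Bd1 Bd2 Bd3
AA 101 0
AA 101 1
AA 101 2
AA 101 10
AA 101 12
", header = TRUE, stringsAsFactors = FALSE)

# For each row i of A, count the rows of B that satisfy all four conditions.
# n_match is a hypothetical column name chosen for this example.
A$n_match <- vapply(seq_len(nrow(A)), function(i) {
  sum(A$Ad1[i] == B$Bd1 &
      A$Ad2[i] == B$Bd2 &
      A$Ad3[i] <= B$Bd3 &
      A$Ad4[i] >  B$Bd3)
}, integer(1))

print(A)
```

Reading the first column as character (stringsAsFactors = FALSE) means the == comparison on Ad1 and Bd1 cannot hit the factor level-set error.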