
R: remove duplicated elements between two sparse matrices

We have a sparse matrix built with the Matrix package:

library(Matrix)
M = sparseMatrix(i = uidx,j = midx,x = freq)

Suppose the matrix M looks like:

i   j   x
1   2   0.2
1   3   0.3
1   15  0.15
2   7   0.1
...
280 2   0.6
281 7   0.25

After some calculation we get another sparse matrix Q:

i   j   x
1   2   18
1   4   16
1   9   8
2   10  19
...

I want to use Q as the base matrix and remove from Q those (i,j) entries that already exist in M,
something like a set minus:

Q-M

In my example it would give a result like:

i   j   x
1   4   16
1   9   8    
...
# Q originally contains (1, 2, 18), but index (1,2) already exists in M (as 1  2  0.2), so that row is removed from Q.

Is there an efficient way or an existing function to do this?
To reproduce this case you can run the following code:

library(Matrix)
M = sparseMatrix(i = c(1,1,1),j = c(2,3,15),x = c(0.2,0.3,0.15))
Q = sparseMatrix(i = c(1,1,1),j = c(2,4,9),x = c(18,16,8))
# the result should be a sparse matrix like:
# R = sparseMatrix(i = c(1,1), j = c(4,9), x = c(16,8))

You can get there with the summary function that comes with the Matrix package. For a sparse matrix it returns a table of the non-zero entries, one row per entry, with columns i, j and x. Based on this, you can compare the (i, j) index pairs of the two matrices directly and use that comparison to select the rows of Q to keep. I expanded the example a bit to check that other values are kept or removed as expected. The result matches the matrix R you expect.

library(Matrix)
M = sparseMatrix(i = c(1,1,1,1, 2, 2),
                 j = c(2,3,15, 16, 4, 8),
                 x = c(0.2,0.3,0.15, 0.16, 0.2, 0.08))
Q = sparseMatrix(i = c(1,1,1,1, 2),
                 j = c(2,4,9,16, 4),
                 x = c(18,16,8,50, 40))

#result should produce a sparse matrix like:
R = sparseMatrix(i = c(1,1), 
                 j = c(4,9), 
                 x = c(16, 8))


# creates a summary of the sparse matrices (summary is coming from Matrix)
summary_m <- summary(M)
summary_q <- summary(Q)

# which records to remove:
# all records of Q whose (i, j) index pair also occurs in M; the x value is
# excluded from the comparison. Each pair is encoded as one key so that the
# match does not depend on row order or on both summaries having equal length.
key_m <- paste(summary_m$i, summary_m$j)
key_q <- paste(summary_q$i, summary_q$j)
remove <- which(key_q %in% key_m)

# keep all Q records that do not match M
# (setdiff also behaves correctly when `remove` is empty, unlike negative indexing)
q_left <- summary_q[setdiff(seq_len(nrow(summary_q)), remove), ]

# build full sparse matrix
result <- sparseMatrix(i = q_left$i, j = q_left$j, x = q_left$x)

identical(result, R)
[1] TRUE
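
The same idea can be wrapped in a small function. This is only a sketch under some assumptions: sparse_index_setdiff is a made-up name (it is not part of the Matrix package), both matrices are assumed to be numeric sparse matrices for which summary() returns the columns i, j and x, and the result is given the dimensions of Q via dims = dim(Q), which the code above does not do.

```r
library(Matrix)

# Hypothetical helper (not part of the Matrix package): drop from Q every
# stored entry whose (i, j) position also holds a stored entry in M.
sparse_index_setdiff <- function(Q, M) {
  sq <- summary(Q)
  sm <- summary(M)
  # Encode each (i, j) pair as one string key so that pairs can be matched
  # regardless of row order or of how many entries each matrix has.
  key_q <- paste(sq$i, sq$j, sep = ",")
  key_m <- paste(sm$i, sm$j, sep = ",")
  keep <- !(key_q %in% key_m)
  # Assumption: the result should keep the shape of Q, hence dims = dim(Q).
  sparseMatrix(i = sq$i[keep], j = sq$j[keep], x = sq$x[keep], dims = dim(Q))
}

# Data from the question
M <- sparseMatrix(i = c(1, 1, 1), j = c(2, 3, 15), x = c(0.2, 0.3, 0.15))
Q <- sparseMatrix(i = c(1, 1, 1), j = c(2, 4, 9),  x = c(18, 16, 8))

res <- sparse_index_setdiff(Q, M)
summary(res)
```

Using a logical keep vector instead of negative indices avoids the case where nothing matches and an empty index would select zero rows.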
