Optimise row wise matrix comparison in R

I've googled extensively and can't seem to find an answer to my problem. Apologies if this has been asked before. I have two matrices, a and b, with the same dimensions. For each row i of a (from i = 1 to the number of rows in a), I want to check whether any element of row i of a also appears in row i of b. I have a solution using sapply, but it becomes quite slow with very large matrices. Is it possible to vectorise my solution somehow? Examples below:

# create example matrices
a = matrix(
  1:9,
  nrow = 3
)

b = matrix(
  4:12,
  nrow = 3
)

# iterate over rows in a....
# returns TRUE for each row of a where any element in ith row is found in the corresponding row i of matrix b
sapply(1:nrow(a), function(x){ any(a[x,] %in% b[x,])})

# However, for large matrices this performs quite poorly. Is it possible to vectorise?

a = matrix(
  runif(14000000),
  nrow = 7000000
)

b = matrix(
  runif(14000000),
  nrow = 7000000
)

system.time({
 sapply(1:nrow(a), function(x){ any(a[x,] %in% b[x,])})
})


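Use apply to find any zero differences. Note that a - b compares elements position by position, so this only flags rows where a and b match in the same column: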

a <- sample(1:3, 9, replace = TRUE)
b <- sample(1:3, 9, replace = TRUE)
a <- matrix(a, ncol = 3)
b <- matrix(b, ncol = 3)

d <- a - b  # named d to avoid masking base::diff
apply(d, 1, function(x) which(x == 0)) # column indexes where the difference is 0
apply(d, 1, function(x) any(x == 0)) # row check only
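
The difference check and the %in% check from the question answer slightly different questions, which is easy to see on the question's own 3 x 3 example. The sketch below just runs both side by side; the names by_membership and by_position are made up for illustration.

```r
a <- matrix(1:9, nrow = 3)
b <- matrix(4:12, nrow = 3)

# Row-wise membership test from the question:
# TRUE if any element of a[i, ] occurs anywhere in b[i, ]
by_membership <- sapply(seq_len(nrow(a)), function(i) any(a[i, ] %in% b[i, ]))

# Position-wise test: TRUE only if a[i, j] == b[i, j] for some column j
by_position <- apply(a - b, 1, function(x) any(x == 0))

by_membership  # TRUE TRUE TRUE
by_position    # FALSE FALSE FALSE
```

Here every row of a shares values with the same row of b, but never in the same column, so the two results disagree.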

or

Maybe you can try intersect + asplit like below (asplit needs R 3.6.0 or later):

lengths(Map(intersect, asplit(a, 1), asplit(b, 1))) > 0
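
or

When the matrices have many rows but only a few columns (as in the 7,000,000 x 2 example in the question), another option might be to loop over the columns of a instead of the rows, and let == and rowSums do the per-row work. This is only a sketch and not benchmarked here; row_any_match is a hypothetical helper name, and it assumes a and b have the same number of rows and contain no NA values.

```r
# Hypothetical helper: TRUE for each row i where any element of a[i, ]
# also occurs in b[i, ]. Assumes equal row counts and no NAs.
row_any_match <- function(a, b) {
  stopifnot(nrow(a) == nrow(b))
  hit <- logical(nrow(a))
  for (j in seq_len(ncol(a))) {
    # a[, j] is recycled down every column of b, so entry [i, k]
    # of the comparison is a[i, j] == b[i, k]
    hit <- hit | (rowSums(a[, j] == b) > 0)
  }
  hit
}

a <- matrix(1:9, nrow = 3)
b <- matrix(4:12, nrow = 3)
row_any_match(a, b)  # TRUE TRUE TRUE
```

The loop runs ncol(a) times rather than nrow(a) times, so each iteration is one vectorised comparison over the whole of b. Like %in% on doubles, it relies on exact equality.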

