Compare a few columns of one matrix against the column of another matrix in R

Here I have created an example data set.

set.seed(1234)
m1 = matrix(runif(2000), nrow = 10, ncol = 200)
dim(m1)
# [1]  10 200
m2 = matrix(runif(100), nrow = 10, ncol = 10)
dim(m2)
# [1] 10 10

I want to compare columns 1:20 of m1 against the 1st column of m2. Similarly, I want to compare columns 21:40 of m1 against the 2nd column of m2, and so on. Finally, columns 181:200 of m1 should be compared against the 10th column of m2.

I wrote the following code to compare the first 20 columns of m1 against the 1st column of m2.

cc = matrix(NA, nrow(m2), ncol(m2))
for (j in 1:ncol(m2)) {
  for (i in 1:nrow(m2)) {
    cc[i, j] = ifelse(m1[i, j] < m2[i,1], 1, 0)
  }
}
ccvalue = data.frame(cc)

How can I improve the above R code to do this comparison? Is there an R function that does this?

Thank you in advance.

You can take advantage of R's implicit vectorization to compare the whole of m1 against the columns of m2 in one step. You just need to make m2 repeat its columns by indexing the same column over and over again. For example, with v <- c("A","B","C"), the expression v[c(1,1,2,2,3,3)] returns "A","A","B","B","C","C".

Test out the following code and let me know if you have any questions:

# we want to compare m1[,c(1,2,3,...)], with m2[,c(1,1,1,...)]
# summing 1,0,...,1,0,... to get 1,1,...,2,2,...
m2_to_compare <- cumsum(rep(c(1,rep(0,19)),10)) 
# length should match m1 columns
length(m2_to_compare) 
(m1 < m2[,m2_to_compare]) * 1 # turns TRUEs and FALSEs into 1s and 0s
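
The same index vector can also be built with rep(..., each = ), which may be easier to read than the cumsum construction. The sketch below assumes m1 and m2 as defined in the question; block_size and col_index are names introduced here for illustration only.

```r
set.seed(1234)
m1 <- matrix(runif(2000), nrow = 10, ncol = 200)
m2 <- matrix(runif(100), nrow = 10, ncol = 10)

# Assumption: each column of m2 is matched against the same number of m1 columns
block_size <- ncol(m1) / ncol(m2)

# 1,1,...,1,2,2,...,2,...: one m2 column index per m1 column
col_index <- rep(seq_len(ncol(m2)), each = block_size)

# Check that this matches the cumsum-based index vector from the answer
m2_to_compare <- cumsum(rep(c(1, rep(0, 19)), 10))
all(col_index == m2_to_compare)

# Element-wise comparison; multiplying by 1 turns TRUE/FALSE into 1/0
cc <- (m1 < m2[, col_index]) * 1
dim(cc)
```

Deriving block_size from the matrix dimensions avoids hard-coding 20 and 10, at the cost of assuming that ncol(m1) is an exact multiple of ncol(m2).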

Answering the comment:

cc = ifelse(m1 < m2[,m2_to_compare], 1, 0)
# depending on your seed:
sapply(1:10, function(colm) rowSums(cc[,m2_to_compare == colm]))
#      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
# [1,]    3    1   19    8    9   11   17    2   12    16
# [2,]    2   19   14   10   10   11    9    1    0    14
# [3,]   16   16   17    7    5   20    1   16    2    17
# [4,]   13    2    0   11   20   11    6    5   12     2
# [5,]    0   10    2    1   10   17    3   14    5     7
# [6,]   11    7   17    9   20   18   18   16    7     4
# [7,]   15    3    5    5    8    5    3    3    9     1
# [8,]    0   18    5    8    9   15    9   17    0    20
# [9,]   15   14    5    1    5    0    6   17   19     6
#[10,]    6    1    4   10   11   12    0    9    7     5
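
As a possible alternative to the sapply call, base R's rowsum() can total the counts per block in a single step. This is only a sketch: it rebuilds m1, m2, and the index vector so that it runs on its own, and counts_sapply and counts_rowsum are illustrative names. The counts depend on the random state, so no output is shown.

```r
set.seed(1234)
m1 <- matrix(runif(2000), nrow = 10, ncol = 200)
m2 <- matrix(runif(100), nrow = 10, ncol = 10)
m2_to_compare <- cumsum(rep(c(1, rep(0, 19)), 10))
cc <- ifelse(m1 < m2[, m2_to_compare], 1, 0)

# Approach from the answer: one rowSums() call per block of 20 columns
counts_sapply <- sapply(1:10, function(colm) rowSums(cc[, m2_to_compare == colm]))

# rowsum() adds up rows by group, so transpose first (columns become rows),
# group by the m2 column index, then transpose back
counts_rowsum <- t(rowsum(t(cc), group = m2_to_compare))

# Both approaches should agree once the dimnames added by rowsum() are dropped
all.equal(unname(counts_rowsum), counts_sapply)
```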

A few things to point out:

(1) It would be good practice to set the seed for matrix m2 as well. Perhaps you overlooked that.

(2) In the code you provided, you seem to be comparing m2 only to the first 10 columns of m1.

If you only mean to compare those 10 columns, you can do it like this:

cc <- (m2 > m1[, c(1:10)])*1
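
Note that the loop in the question compares each of the first 10 columns of m1 against the first column of m2 only. If that exact behaviour is what is wanted, a vectorized equivalent can rely on recycling, because a vector with one value per row is reused for every column. A minimal sketch, with cc_loop and cc_vec as illustrative names:

```r
set.seed(1234)
m1 <- matrix(runif(2000), nrow = 10, ncol = 200)
m2 <- matrix(runif(100), nrow = 10, ncol = 10)

# The loop from the question, reproduced here for comparison
cc_loop <- matrix(NA, nrow(m2), ncol(m2))
for (j in 1:ncol(m2)) {
  for (i in 1:nrow(m2)) {
    cc_loop[i, j] <- ifelse(m1[i, j] < m2[i, 1], 1, 0)
  }
}

# m2[, 1] has one value per row, so it is recycled column by column
cc_vec <- (m1[, 1:10] < m2[, 1]) * 1

# Check that both versions give the same 0/1 matrix
all(cc_loop == cc_vec)
```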
