matrix index subsetting with another matrix

Question

What's a fast way to match the rows of two matrices (one and two) and extract the row indices of matrix two where the matches occur? Matrix two is large (hundreds to thousands of rows).

one
     [,1] [,2]
[1,]    9   11
[2,]   13    2

head(two)
     [,1] [,2]
[1,]    9   11
[2,]   11    9
[3,]    2    3
[4,]   13    2
[5,]    2    4
[6,]    3    3

The output should be as follows (note that index 2 is not returned, because row 2 of two holds the same values in the opposite order):

1 4

Answer 1

One way of doing this:

a = apply(one, 1, paste0, collapse = "-")
b = apply(two, 1, paste0, collapse = "-")
match(a, b)

#[1] 1 4

We paste the columns together row-wise for both matrices, giving one string per row, and then match the strings to find the rows of two that are identical to rows of one.

Just for reference,

a
#[1] "9-11" "13-2"
b
#[1] "9-11" "11-9" "2-3"  "13-2" "2-4"  "3-3"
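To make the approach above easy to try, here is a minimal self-contained sketch. It rebuilds the example data from the question and wraps the two paste steps in a helper; the name match_rows and its sep argument are hypothetical, not part of the original answer. It also shows that a row with no counterpart comes back as NA.

```r
# Rebuild the example data from the question (matrix() fills column by column).
one <- matrix(c(9, 13, 11, 2), ncol = 2)
two <- matrix(c(9, 11, 2, 13, 2, 3,
                11, 9, 3, 2, 4, 3), ncol = 2)

# Hypothetical helper: build one string key per row, then match the keys.
match_rows <- function(x, y, sep = "-") {
  key_x <- apply(x, 1, paste0, collapse = sep)
  key_y <- apply(y, 1, paste0, collapse = sep)
  match(key_x, key_y)  # NA where a row of x has no counterpart in y
}

match_rows(one, two)
#[1] 1 4

# A row that does not occur in two yields NA.
match_rows(rbind(one, c(7, 7)), two)
#[1]  1  4 NA
```

The separator matters: without one, rows such as (1, 23) and (12, 3) would both become "123" and be treated as equal.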

Answer 2

You could write a C++ loop to do it fairly quickly:

library(Rcpp)

cppFunction('NumericVector matrixIndex(NumericMatrix m1, NumericMatrix m2){

int m1Rows = m1.nrow();
int m2Rows = m2.nrow();
NumericVector out;  

for (int i = 0; i < m1Rows; i++){
  for (int j = 0; j < m2Rows; j++){

    // Row j of m2 equals row i of m1 in both columns: record its 1-based index.
    if (m1(i, 0) == m2(j, 0) && m1(i, 1) == m2(j, 1)) {
      out.push_back(j + 1);
    }
  }
}

return out;

}')

matrixIndex(one, two)
#[1] 1 4

That said, I suspect it would be faster to pre-allocate the result vector rather than grow it with push_back, something like:

cppFunction('NumericVector matrixIndex(NumericMatrix m1, NumericMatrix m2){

int m1Rows = m1.nrow();
int m2Rows = m2.nrow();
NumericVector out(m2Rows);  

for (int i = 0; i < m1Rows; i++){
  for (int j = 0; j < m2Rows; j++){

    // Row j of m2 equals row i of m1: store the 1-based index in slot j.
    if (m1(i, 0) == m2(j, 0) && m1(i, 1) == m2(j, 1)) {
      out[j] = j + 1;
    }
  }
}

return out;

}')

matrixIndex(one, two)
#[1] 1 0 0 4 0 0
## 0 == no match; the result has one entry per row of two.
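The pre-allocated version returns one slot per row of two, so the zeros need to be dropped to get the same answer as the first version. A minimal sketch follows; to keep it runnable without a compiler, it uses the printed result above as a stand-in for the value returned by matrixIndex (the name res is only illustrative).

```r
# Stand-in for the value returned by the pre-allocated matrixIndex() above.
res <- c(1, 0, 0, 4, 0, 0)

# Keep only the slots that recorded a match.
idx <- res[res != 0]
idx
#[1] 1 4
```

Note that the indices come out in the row order of two, not in the row order of one.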

Answer 3

You don't say whether by "fast" you mean compute time or person time. If this only needs doing once, the overall time is probably shortest if you optimize for person time, and Ronak's answer is going to be hard to beat: it's clear and robust.

If the numbers are all non-negative integers below a known bound (say, 100, as in your example data), you can do a similar thing but use arithmetic to combine the two columns into a single number per row and then match. I suspect (but haven't tested) that this would be faster than converting to character vectors. There are of course other arithmetic options too, depending on your circumstances.

a <- one[,1]*100 + one[,2]
b <- two[,1]*100 + two[,2]
match(a, b)
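As a sketch of the same idea without hard-coding 100, the multiplier can be derived from the data. This assumes every entry is a non-negative whole number; the variable name base is my own, and the example data is rebuilt from the question.

```r
# Rebuild the example data from the question.
one <- matrix(c(9, 13, 11, 2), ncol = 2)
two <- matrix(c(9, 11, 2, 13, 2, 3,
                11, 9, 3, 2, 4, 3), ncol = 2)

# Assumption: all entries are non-negative whole numbers.
# Any multiplier larger than every entry gives each distinct row a distinct key.
base <- max(one, two) + 1

a <- one[, 1] * base + one[, 2]
b <- two[, 1] * base + two[, 2]
match(a, b)
#[1] 1 4
```

If the multiplier is too small, different rows can collide: with a multiplier of 10, the rows (1, 20) and (2, 10) both map to 30.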

Answer 4

We can use %in%:

which(do.call(paste, as.data.frame(two)) %in% do.call(paste, as.data.frame(one)))
#[1] 1 4
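One difference from the match-based answers is worth seeing on an example. The sketch below reverses the rows of one (the names one_rev, key_one and key_two are illustrative only): which() with %in% reports positions in the row order of two, while match() reports them in the row order of the first argument.

```r
# Rebuild the example data from the question.
one <- matrix(c(9, 13, 11, 2), ncol = 2)
two <- matrix(c(9, 11, 2, 13, 2, 3,
                11, 9, 3, 2, 4, 3), ncol = 2)

one_rev <- one[2:1, ]  # same rows as one, in reverse order

key_two <- do.call(paste, as.data.frame(two))
key_one <- do.call(paste, as.data.frame(one_rev))

# Positions in two that match any row of one_rev, in ascending order.
which(key_two %in% key_one)
#[1] 1 4

# Position in two of each row of one_rev, in the order of one_rev.
match(key_one, key_two)
#[1] 4 1
```

If two contains the same row more than once, which() with %in% returns every occurrence, whereas match() returns only the first.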

matrix index subsetting with another matrix

Question

4 answers

solution1
4 2017-06-02 02:01:00

solution2
1 ACCEPTED 2017-06-02 02:07:52

solution3
1 2017-06-02 02:22:23

solution4
0 2017-06-02 03:46:57
