How to get the intersection of two matrices?

Question

# These are the two matrices that I would like to subset based on identical
# entries within entire rows.
mata <- matrix(c("A", "B", "C", "F", "D", "E", "F", "G"), 
               nrow = 4, ncol = 2,
               dimnames = list(c(), c("A", "B")))
mata

##      A   B  
## [1,] "A" "D"
## [2,] "B" "E"
## [3,] "C" "F"
## [4,] "F" "G"

matb <- matrix(c("B", "A", "C", "F", "M", "D", "D", "H", "G", "X"), 
               nrow = 5, ncol = 2,
               dimnames = list(c(), c("A", "B")))
matb

##      A   B  
## [1,] "B" "D"
## [2,] "A" "D"
## [3,] "C" "H"
## [4,] "F" "G"
## [5,] "M" "X"

If the two matrices had their rows in the same order and had the same number of rows, the following code would work and would be efficient.

mata[rowMeans(mata == matb) == 1, ]
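To illustrate what that one-liner does, here is a minimal sketch on two small matrices of identical shape. The names m1, m2, same_row and same_pos are made up for this example and are not part of the question; drop = FALSE is added so that the result stays a matrix even if only one row matches.

```r
# Hypothetical same-shape matrices, used only for illustration.
m1 <- matrix(c("A", "B", "C", "D", "E", "F"), nrow = 3,
             dimnames = list(NULL, c("A", "B")))
m2 <- matrix(c("A", "X", "C", "D", "E", "F"), nrow = 3,
             dimnames = list(NULL, c("A", "B")))

# m1 == m2 compares element by element; a row mean of 1 means that
# every entry in that row matched the entry at the same position.
same_row <- rowMeans(m1 == m2) == 1

# Keep only the rows that are identical at the same position.
same_pos <- m1[same_row, , drop = FALSE]
same_pos
```

Note that this only finds rows that agree at the same row index, which is why it does not help once the matrices differ in order or in number of rows.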

A hackish solution of mine is to concatenate, for each matrix, the columns that I want to match on into a single key column. In this example I use all columns.

mata <- cbind(mata, C = paste0(mata[, "A"], "_", mata[, "B"]))
matb <- cbind(matb, C = paste0(matb[, "A"], "_", matb[, "B"]))
mata[mata[, "C"] %in% matb[, "C"], colnames(mata) != "C"]

##      A   B  
## [1,] "A" "D"
## [2,] "F" "G"

This is the result that I am looking for, but I am wondering whether there is something more elegant, such as an equivalent of the %in% operator for vectors.

Edit

The solution should apply to general cases where the matrices do not necessarily have the same number of rows.

Answer 1

You could use the function merge() for this:

> merge(mata,matb)
  A B
1 A D
2 F G
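One thing worth noting is that merge() returns a data.frame rather than a matrix. If a matrix is needed afterwards, a sketch like the following could be used. It recreates the original two-column mata and matb from the question so that it runs on its own; common and common_mat are names chosen for this example.

```r
# Recreate the original two-column matrices from the question.
mata <- matrix(c("A", "B", "C", "F", "D", "E", "F", "G"),
               nrow = 4, ncol = 2,
               dimnames = list(NULL, c("A", "B")))
matb <- matrix(c("B", "A", "C", "F", "M", "D", "D", "H", "G", "X"),
               nrow = 5, ncol = 2,
               dimnames = list(NULL, c("A", "B")))

# merge() coerces both inputs to data.frames and joins on the shared
# column names (here A and B). unique() guards against repeated rows,
# since merge() returns one row per matching pair.
common <- unique(merge(mata, matb))

# Convert back to a character matrix if a matrix is required.
common_mat <- as.matrix(common)
common_mat
```

The unique() call only matters when a row occurs more than once in either input; with the data from the question it changes nothing.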

Answer 2

If you load dplyr, intersect.data.frame is added:

library(dplyr)
options(stringsAsFactors=FALSE)
dfa <- as.data.frame(mata)
dfb <- as.data.frame(matb)
intersect(dfa,dfb)

#   A B
# 1 A D
# 2 F G

Similarly, union, setequal (testing set equality) and setdiff (set difference) are available.

Aside. Each row of a data.frame corresponds to an observation, so it makes sense to talk about intersecting two sets of observations (two data.frames). For matrices, however, it really does not make sense. That's why hacks like the OP's solution and @RHertel's (which coerces to data.frame behind the scenes) are needed for this operation if you want to continue using matrices.
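If you do want to stay with matrices, the OP's key-based hack can be wrapped in a small helper that works for any number of columns. This is only a sketch: row_in() is a hypothetical name, not a function from base R or dplyr, and it assumes that the separator (a carriage return by default) never occurs in the data, since otherwise different rows could produce the same key.

```r
# Recreate the original two-column matrices from the question.
mata <- matrix(c("A", "B", "C", "F", "D", "E", "F", "G"),
               nrow = 4, ncol = 2,
               dimnames = list(NULL, c("A", "B")))
matb <- matrix(c("B", "A", "C", "F", "M", "D", "D", "H", "G", "X"),
               nrow = 5, ncol = 2,
               dimnames = list(NULL, c("A", "B")))

# Hypothetical helper: for each row of x, is there an identical row in y?
# Assumption: `sep` does not appear in any matrix entry.
row_in <- function(x, y, sep = "\r") {
  key_x <- apply(x, 1, paste, collapse = sep)  # one key string per row of x
  key_y <- apply(y, 1, paste, collapse = sep)  # one key string per row of y
  key_x %in% key_y
}

# Rows of mata that also occur in matb (the intersection).
both <- mata[row_in(mata, matb), , drop = FALSE]

# Rows of mata that do not occur in matb (a row-wise set difference).
only_a <- mata[!row_in(mata, matb), , drop = FALSE]
```

Unlike intersect() on data.frames, this keeps duplicated rows of mata if there are any; wrapping the result in unique() would remove them.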

2 answers

Answer 1
4 accepted 2015-08-10 14:12:05

Answer 2
4 2015-08-10 14:22:49
