How can I find repeated values/data points and their index in a 2D matrix or data frame in R?

For example, suppose I have the matrix A:

  x  y z    f
1 1  2 A 1005
2 2  4 B 1002
3 3  2 B 1001
4 4  8 C 1001
5 5 10 D 1004
6 6 12 D 1004
7 7 11 E 1005
8 8 14 E 1003

From this matrix I want to find the repeated values, such as 1001, 1005, D, and 2 (in the y column), and I also want to find their index (which row, or which position).

I am new to R! Obviously this can be done by searching element by element with a for loop, but I want to know whether R has a function for this kind of problem.

Furthermore, I tried duplicated() and unique(). Both give me the duplicated row or column numbers and how many entries were repeated, but I cannot search the whole matrix with either of them!

You can write a rather simple function to get this information. Note, though, that this solution works with a matrix; it does not work with a data.frame. A similar function could be written for a data.frame using the fact that a data.frame is a special kind of list (a list of equal-length columns).

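# example data
set.seed(234)
m <- matrix(sample(1:10, size = 100, replace = TRUE), 10)

# Return the (row, column) position of every cell of `mat` equal to `value`.
find_matches <- function(mat, value) {
  nr <- nrow(mat)
  val_match <- which(mat == value)   # column-major linear indices
  out <- matrix(NA, nrow = length(val_match), ncol = 2)
  # Subtract 1 before dividing so that a match in the last row
  # (linear index a multiple of nr) gets the correct row and column.
  out[, 2] <- (val_match - 1) %/% nr + 1   # column index
  out[, 1] <- (val_match - 1) %% nr + 1    # row index
  return(out)
}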
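
As a side note on the index arithmetic: which() on a matrix returns linear positions counted down the columns, so turning a position into a (row, column) pair is a remainder and an integer division on position minus one. The sketch below is only an illustration on a made-up 3 x 4 matrix (the names mat, by_hand and built_in are not from the original answer); it compares the hand-written conversion with base R's which(..., arr.ind = TRUE), which performs the same conversion.

```r
# A small 3 x 4 matrix; R fills it column by column.
mat <- matrix(c(5, 8, 8,
                1, 2, 8,
                8, 0, 3,
                4, 8, 8), nrow = 3)

idx <- which(mat == 8)   # linear (column-major) positions of the matches
nr  <- nrow(mat)

# Shift to 0-based before dividing so that the last row of a column
# (idx = nr, 2 * nr, ...) does not spill into the next column.
rows <- (idx - 1) %% nr + 1
cols <- (idx - 1) %/% nr + 1
by_hand <- cbind(row = rows, col = cols)

# Base R can return row/column pairs directly.
built_in <- which(mat == 8, arr.ind = TRUE)

print(by_hand)
print(built_in)
```

Both matrices list the same positions, so for a single value on a plain matrix the built-in call may be all that is needed.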

R> m 
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    8    6    6    7    6    7    4   10    6     9
 [2,]    8    6    6    3   10    4    5    4    6     9
 [3,]    1    6    9    2    9    2    3    6    4     2
 [4,]    8    6    7    8    3    9    9    4    9     2
 [5,]    1    1    5    6    7    1    5    1   10     6
 [6,]    7    5    4    7    8    2    4    4    7    10
 [7,]   10    4    7    8    3    1    8    6    3     4
 [8,]    8    8    2    2    7    5    6    4   10     4
 [9,]   10    2    9    6    6    9    7    2    4     7
[10,]    3    9    9    4    2    7    7    2    9     6
R> find_matches(m, 8)
      [,1] [,2]
 [1,]    1    1
 [2,]    2    1
 [3,]    4    1
 [4,]    8    1
 [5,]    8    2
 [6,]    4    4
 [7,]    7    4
 [8,]    6    5
 [9,]    7    7

In the output of this function, column 1 holds the row index and column 2 holds the column index.
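
For the data.frame in the question, one possible approach is to go through the columns one at a time, since a data.frame is a list of columns. The following is a minimal sketch, not the answerer's method: the helper name find_repeats and the layout of its result are assumptions made for illustration, and "repeated" is taken to mean repeated within a single column. It relies on duplicated() with fromLast = TRUE to flag every occurrence of a repeated value rather than only the later ones.

```r
# The question's example data, rebuilt as a data.frame.
A <- data.frame(
  x = 1:8,
  y = c(2, 4, 2, 8, 10, 12, 11, 14),
  z = c("A", "B", "B", "C", "D", "D", "E", "E"),
  f = c(1005, 1002, 1001, 1001, 1004, 1004, 1005, 1003),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each column, report every value that occurs
# more than once in that column, together with the rows where it occurs.
find_repeats <- function(df) {
  per_column <- lapply(names(df), function(col) {
    v <- df[[col]]
    # TRUE for all occurrences of a repeated value, first one included
    is_rep <- duplicated(v) | duplicated(v, fromLast = TRUE)
    data.frame(
      column = rep(col, sum(is_rep)),
      row    = which(is_rep),
      value  = as.character(v[is_rep]),  # character, so mixed column types can be stacked
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_column)
}

repeats <- find_repeats(A)
print(repeats)
```

Each row of the result names a column, a row number, and the repeated value found there; for the example data, column y contributes the value 2 at rows 1 and 3, while column x contributes nothing because all of its values are distinct.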
