How can I remove matrices that are duplicates within floating-point error from a list?

Question

This question is similar to questions that have been asked about floating-point error in other languages (for example here), but I haven't found a satisfactory solution.

I'm working on a project that involves investigating matrices that share certain characteristics. As part of that, I need to know how many matrices in a list are unique.

 D <- as.matrix(read.table("datasource",...))
 mat_list <- vector('list',length=length(samples_list))
 mat_list <- lapply(1:length(samples_list),function(i) matrix(data=0,nrow(D),ncol(D)))

This list is then populated by computations from the data based on the elements of samples_list. After mat_list has been populated, I need to remove duplicates. Running

mat_list <- unique(mat_list)

narrows things down quite a bit; however, many of the remaining elements are really within machine error of each other. The function unique does not allow one to specify a precision, and I was unable to find its source code to modify it.
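To make the problem concrete, here is a minimal sketch with two small made-up matrices (m1 and m2 are illustrative, not taken from the data above). It shows that unique treats matrices differing only by round-off as distinct, while all.equal treats them as equal.

```r
# Two matrices that differ only by floating-point round-off
# (0.1 + 0.2 is not exactly 0.3 in double precision)
m1 <- matrix(c(0.3, 0.6, 0.9, 1.2), 2, 2)
m2 <- matrix(c(0.1 + 0.2, 0.6, 0.9, 1.2), 2, 2)

# unique() compares list elements exactly, so both matrices survive
n_exact <- length(unique(list(m1, m2)))

# identical() is an exact comparison; all.equal() allows a small tolerance
same_exact  <- identical(m1, m2)
same_approx <- isTRUE(all.equal(m1, m2))
```

Here n_exact is 2 even though the two matrices agree to within machine precision.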

One idea I had was this:

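ErrorReduction <- function(mat_list, tol = 2) {
  len <- length(mat_list)
  # seq_len(len - 1) avoids the precedence trap in 1:len-1, which is 0:(len-1)
  for (i in seq_len(len - 1)) {
    # Difference between each matrix and its successor, computed inside the loop
    diff <- mat_list[[i]] - mat_list[[i + 1]]
    # "i" selects the infinity norm (maximum absolute row sum)
    if (norm(diff, "i") < tol) {
      # [[i]] copies the matrix itself; [i] would assign a one-element list
      mat_list[[i + 1]] <- mat_list[[i]]
    }
  }
  mat_list <- unique(mat_list)
  return(mat_list)
}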

but this only looks at pairwise differences between neighboring elements. It would be simple, but most likely inefficient, to do this with nested for loops.
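For reference, the nested-loop idea could look roughly like the sketch below. The function name dedupe_greedy, the default tol, and the use of the largest absolute entrywise difference as the distance are all assumptions made for illustration. It also assumes every matrix in the list has the same dimensions, as they do when created from nrow(D) and ncol(D).

```r
# Hypothetical helper: keep a matrix only if it is not within tol
# (largest absolute entrywise difference) of one already kept.
dedupe_greedy <- function(mat_list, tol = 1e-8) {
  kept <- list()
  for (m in mat_list) {
    is_dup <- FALSE
    for (k in kept) {
      if (max(abs(m - k)) < tol) {
        is_dup <- TRUE
        break
      }
    }
    if (!is_dup) kept[[length(kept) + 1]] <- m
  }
  kept
}

# Small made-up example: b is a round-off-sized perturbation of a
a <- matrix((1:4) / 3, 2, 2)
b <- a + 1e-12
d <- a + 1
res <- dedupe_greedy(list(a, b, d, a))
```

Each matrix is compared only against the representatives kept so far, so the work is quadratic in the worst case but much less when the list contains many near-duplicates.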

What methods do you know of, or what ideas do you have, for identifying and removing matrices that are within machine error of being duplicates?

Answer 1

Here is a function that applies all.equal to every pair using outer and removes all duplicates:

approx.unique <- function(l) {
   is.equal.fun <- function(i, j)isTRUE(all.equal(norm(l[[i]] - l[[j]], "M"), 0))
   is.equal.mat <- outer(seq_along(l), seq_along(l), Vectorize(is.equal.fun))
   is.duplicate <- colSums(is.equal.mat * upper.tri(is.equal.mat)) > 0
   l[!is.duplicate]
}

An example:

a <- matrix(runif(12), 4, 3)
b <- matrix(runif(12), 4, 3)
c <- matrix(runif(12), 4, 3)

all <- list(a1 = a, b1 = b, a2 = a, a3 = a, b2 = b, c1 = c)

names(approx.unique(all))
# [1] "a1" "b1" "c1"
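The example above uses exact copies. As a hedged variation, the sketch below takes the same outer-based approach but with an explicit tolerance, and tests it on matrices perturbed by a round-off-sized amount. The name approx_unique_tol, the tol parameter, and the entrywise distance are assumptions for illustration, not part of the answer's function.

```r
# Hypothetical variant with an explicit tolerance on the largest
# absolute entrywise difference between two matrices.
approx_unique_tol <- function(l, tol = 1e-8) {
  n <- length(l)
  if (n < 2) return(l)
  is_close <- function(i, j) max(abs(l[[i]] - l[[j]])) < tol
  close_mat <- outer(seq_len(n), seq_len(n), Vectorize(is_close))
  # Element j is a duplicate if it is close to some earlier element i < j
  is_dup <- colSums(close_mat & upper.tri(close_mat)) > 0
  l[!is_dup]
}

set.seed(1)
a <- matrix(runif(12), 4, 3)
b <- matrix(runif(12), 4, 3)

# a2 and b2 are perturbed by an amount far below the tolerance
mats <- list(a1 = a, b1 = b, a2 = a + 1e-12, b2 = b - 1e-12)
kept <- names(approx_unique_tol(mats))
```

Note that closeness within a tolerance is not transitive, so the result can depend on the order of the list when matrices form a chain of near-neighbors.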

Answer 2

I believe you are looking for all.equal, which compares objects 'within machine error'. Check out ?all.equal.
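A short sketch of how all.equal behaves on matrices, using made-up values. When the objects differ, all.equal returns a character description, not FALSE, so it is usually wrapped in isTRUE.

```r
x <- matrix(c(1, 2, 3, 4), 2, 2)
y <- x + 1e-10   # differs by much less than the default tolerance
z <- x + 1e-3    # differs by much more than the default tolerance

near <- isTRUE(all.equal(x, y))

# Not equal: all.equal() returns a character string describing the difference
far_raw <- all.equal(x, z)
far <- isTRUE(far_raw)

# The tolerance argument loosens the comparison
loose <- isTRUE(all.equal(x, z, tolerance = 1e-2))
```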

Answer 1
6, accepted, 2013-04-30 00:16:41

Answer 2
1, 2013-04-30 00:10:34
