
Incorporating point error information into a distance function: how to do it in R?

I have been working with the proxy package in R to implement a distance measure that weights Euclidean distance by the propagated errors of each individual point. The formula to do this is:

$$\frac{\sqrt{(x_i - x_j)^2 + (y_i - y_j)^2 + \dots + (n_i - n_j)^2}}{\sqrt{(\sigma_{x_i}^2 + \sigma_{x_j}^2) + (\sigma_{y_i}^2 + \sigma_{y_j}^2) + \dots + (\sigma_{n_i}^2 + \sigma_{n_j}^2)}}$$
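For a single pair of points, the formula can be written directly in base R. This is only a minimal sketch: the function name `err_weighted_pair` and its argument names are made up for illustration, with `p_i`, `p_j` holding the coordinates of the two points and `s_i`, `s_j` their per-coordinate errors.

```r
# Hypothetical helper: error-weighted distance between two points.
# p_i, p_j: coordinate vectors; s_i, s_j: per-coordinate errors (sigmas).
err_weighted_pair <- function(p_i, p_j, s_i, s_j) {
  stopifnot(length(p_i) == length(p_j),
            length(s_i) == length(s_j),
            length(p_i) == length(s_i))
  numerator   <- sqrt(sum((p_i - p_j)^2))   # plain Euclidean distance
  denominator <- sqrt(sum(s_i^2 + s_j^2))   # propagated error of the pair
  numerator / denominator
}

# Points (0, 0) and (3, 4), each with an error of 1 on both coordinates:
# the numerator is 5, the denominator is sqrt(4) = 2, so the result is 2.5.
print(err_weighted_pair(c(0, 0), c(3, 4), c(1, 1), c(1, 1)))
```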

I was able to get proxy to work for me in a basic sense (see "proxy package in R, can't make it work") and replicated plain Euclidean distance functionality, hooray for the amateur.

However, once I started writing the function for the error-weighted distance, I immediately ran into a difficulty: I need to read in the errors separately from the points and have them processed separately.

I know that R has very strong functionality and I'm sure it can do this, but for the life of me, I don't know how. It looks like proxy's dist can handle two matrix inputs, but how would I tell it that matrix X is the points and matrix Y is the errors, and then have each go to its appropriate part of the function before being ultimately combined into the distance measure?

I had been hoping to use proxy directly, but I realized that it looks like I can't. I believe I was able to come up with a function that works. First, the distance function:

# x and y are the error vectors of two points
DistErrAdj <- function(x, y) {
    # Square root of the summed squared errors, as in the formula's denominator
    sqrt(sum(x^2 + y^2))
}
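The order of `sqrt` and `sum` matters for the error term. As a quick check on two made-up error vectors (`err_i` and `err_j` are illustrative values, not data from the question), the square root of the summed squared errors, which is what the formula's denominator asks for, differs from the sum of per-coordinate square roots:

```r
# Hypothetical per-coordinate errors for two points
err_i <- c(1, 2)
err_j <- c(2, 1)

# Denominator as written in the formula: one square root over the whole sum
sqrt_of_sum <- sqrt(sum(err_i^2 + err_j^2))   # sqrt(10), about 3.16

# Alternative: one square root per coordinate, then summed
sum_of_sqrt <- sum(sqrt(err_i^2 + err_j^2))   # 2 * sqrt(5), about 4.47

print(c(sqrt_of_sum = sqrt_of_sum, sum_of_sqrt = sum_of_sqrt))
```

The two agree only when there is a single coordinate, so it is worth confirming which one the registered function computes.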

Followed, of course, by 其次,当然是

library(proxy)
pr_DB$set_entry(FUN=DistErrAdj,names="DistErrAdj")

Then, I took code kindly written by augix ( http://augix.com/wiki/Make%20trees%20in%20R,%20test%20its%20stability%20by%20bootstrapping.html ) and altered it to suit my needs, to wit:

boot.errtree <- function(x, q, B = 1001, tree = "errave") {
    library(ape)
    library(protoclust)
    library(cluster)
    library(proxy)
    # Error-weighted dissimilarity: Euclidean distance between the rows of m
    # divided by the combined error term from q. proxy::dist is named explicitly
    # because "DistErrAdj" is registered in proxy's pr_DB, not in stats::dist.
    errdist <- function(m) {
        proxy::dist(m, method = "euclidean") / proxy::dist(q, method = "DistErrAdj")
    }
    # Default ("errave"): average-linkage agglomerative clustering
    func <- function(x) {
        tr <- agnes(errdist(x), diss = TRUE, method = "average")
        as.phylo(as.hclust(tr))
    }
    if (tree == "errprot") {
        # Minimax-linkage clustering
        func <- function(x) {
            as.phylo(protoclust(errdist(x)))
        }
    }
    if (tree == "errdiv") {
        # Divisive clustering
        func <- function(x) {
            tr <- diana(errdist(x), diss = TRUE)
            as.phylo(as.hclust(tr))
        }
    }
    tr_real <- func(x)
    plot(tr_real)
    # boot.phylo resamples the columns of x only; q is reused unchanged in every replicate
    bp <- boot.phylo(tr_real, x, FUN = func, B = B)
    nodelabels(bp)
    return(bp)
}

It seems to work.
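One way to check the distance part independently of proxy is to build the same ratio in base R and compare a few entries by hand. The sketch below assumes two hypothetical matrices, `pts` (points) and `errs` (errors), with one row per object and matching columns. It relies on the fact that the summed squared errors for a pair are just the two per-object row sums added together.

```r
# Hypothetical data: three objects, two coordinates each
pts  <- matrix(c(0, 0,
                 3, 4,
                 6, 8), ncol = 2, byrow = TRUE)
errs <- matrix(c(1, 1,
                 1, 1,
                 2, 2), ncol = 2, byrow = TRUE)

# Numerator: ordinary Euclidean distances between the points
num <- as.matrix(stats::dist(pts, method = "euclidean"))

# Denominator: sqrt(s_i + s_j), where s_i is the summed squared error of object i
s   <- rowSums(errs^2)
den <- sqrt(outer(s, s, "+"))

ratio <- num / den
diag(ratio) <- 0                    # guard against 0/0 when an object has zero error
err_dist <- stats::as.dist(ratio)   # usable wherever a "dist" object is accepted

print(err_dist)
```

Here the entry for objects 1 and 2 is 5 / 2 = 2.5, which can be compared against whatever the proxy-based version returns for the same inputs.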
