
Calculate the weighted distance from a modified distance matrix

I have a modified distance matrix, and I want to use the transformed (normalized) distances to create a new variable. The code below produces some example data.

set.seed(12)

size <- sample(100:1000, 7)
var <- c("V3", "V4", "V5", "V6", "V7", "V8", "V9")
dist <- matrix(runif(49), nrow = 7, ncol = 7)  # 7 x 7 matrix needs 49 values
diag(dist) <- 0

df <- as.data.frame(cbind(var, size, dist))

This produces a dataset that looks like this:

  var size                  V3                V4                 V5                V6                V7                 V8                V9
1  V3  549                   0 0.264918377622962  0.787836347473785 0.439429325051606 0.941087544662878   0.97763589094393 0.774718186818063
2  V4  445  0.0228777434676886                 0 0.0978530396241695 0.669819295872003 0.693911424372345  0.197649595327675 0.394586439244449
3  V5  435 0.00832482660189271 0.457607151241973                  0 0.240883231163025 0.843702238984406  0.844225987326354 0.361513090785593
4  V6  346   0.392697197152302 0.540707547217607  0.217823043232784                 0 0.384644460165873 0.0950279189273715 0.421090044546872
5  V7  958   0.813880559289828 0.665679829893634  0.267943592974916 0.882756386883557                 0  0.381151003297418 0.322011524345726
6  V8  273    0.37624845537357 0.112698937533423  0.504767951788381 0.814063254510984  0.58848182996735                  0 0.552160830702633
7  V9  552   0.380812183720991  0.21836716751568  0.188586926786229 0.633264608215541 0.530477509833872  0.152623838977888                 0

Each row is a point, identified by var, and the columns V3, V4, and so on hold the distance between that point and each of the other points. For example, for the row where var == V4, the distance to V5 is in the column called V5. size is the size of the point in that row.

What I want to do is calculate the weighted sum of the distances, where each distance is weighted according to the size of the other point. See the formula below:

[formula image missing]

where Si is the size of unit i (the variable called size), Di is the normalized distance between one point (i.e. column V3, V4, V5, ...) and the i-th point, and the summation is over all k units.

For example, Di can be the distance from the given point V3 to V4 (0.264918377622962), in which case Si is the size of var == V4 (i.e. 445).

How do I perform this calculation when my data looks like this?

Thanks!

Perhaps this is what you are looking for?

Working column-wise, for each of the distance columns (1:7) we divide the size of every other point by its distance in that column, which represents the point in question. The diagonal is excluded, since a point's distance to itself is 0. Summing the result gives the weighted size for that point.
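The formula image from the question is not available, so what follows is a reconstruction from the answer's code and not the asker's original formula. Assuming S_i is the size of the point in row i, D_{ij} is the entry in row i and column j of the distance matrix, and k = 7 is the number of points, the value computed for the point in column j appears to be:

```latex
\mathrm{WS}_j \;=\; \sum_{\substack{i = 1 \\ i \neq j}}^{k} \frac{S_i}{D_{ij}}
```

For the column V3 in the example data, the first two terms are 445 / 0.0228777 (roughly 19,451) and 435 / 0.00832483 (roughly 52,253). Together they account for most of the first value printed by the code, 75937.840.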

set.seed(12)

size <- sample(100:1000, 7)
var <- c("V3", "V4", "V5", "V6", "V7", "V8", "V9")
dist <- matrix(runif(49), nrow = 7, ncol = 7)
diag(dist) <- 0

df <- as.data.frame(cbind(var, size, dist))

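# cbind() coerced size and the distances to text, so convert them back to numeric.
# For point i: sizes of the other points divided by their distances in column i.
df$WS <- sapply(seq_len(nrow(df)),
         function(i) sum(as.numeric(as.character(df[[2]][-i])) /
                         as.numeric(as.character(df[[i + 2]][-i]))))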

df$WS
#> [1] 75937.840 10052.202 13876.181  6011.826  4144.254 13099.493  7330.831

Created on 2020-11-13 by the reprex package (v0.3.0)
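If the distances are kept as a numeric matrix and not pushed through cbind() into a data frame of text, the same sums can be sketched without the as.numeric(as.character()) conversions. This is an alternative sketch, not part of the original answer. The names pts, inv, ws_col and ws_row are made up for illustration. The row-wise variant is an assumption about how the question's example could also be read: there, the distance from V3 to V4 sits in row V3 and is paired with the size of V4.

```r
set.seed(12)

size <- sample(100:1000, 7)
pts  <- paste0("V", 3:9)                      # hypothetical labels: V3 ... V9
dist <- matrix(runif(49), nrow = 7, ncol = 7,
               dimnames = list(pts, pts))
diag(dist) <- 0

inv <- 1 / dist        # reciprocal distances; Inf on the diagonal
diag(inv) <- 0         # drop each point's zero distance to itself

# Column-wise, as in the sapply() version above:
# ws_col[i] = sum over j != i of size[j] / dist[j, i]
ws_col <- drop(size %*% inv)

# Row-wise reading (assumption):
# ws_row[j] = sum over i != j of size[i] / dist[j, i]
ws_row <- drop(inv %*% size)

ws_col
ws_row
```

For a symmetric distance matrix the two results coincide. The random example matrix is not symmetric, so here they differ, and which one is wanted depends on how the rows and columns are meant to be read.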
