Convert a dataframe to an object of class "dist" without actually calculating distances in R

I have a dataframe of pairwise distances:

df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))

I need to convert this to an object of class "dist", but I do not need to calculate any distances, so I cannot use the dist() function. Any advice?

There is nothing stopping you from creating the dist object yourself. It is just a vector of distances with attributes that set up the labels, size, etc.
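To see what those attributes look like, you can take apart a dist object that dist() itself produced. This is only an illustrative sketch; the matrix `pts` is made up for the purpose and is not part of the question's data.

```r
# Three made-up points on a line, so the distances are easy to check by hand.
pts <- matrix(c(0, 0,
                3, 4,
                6, 8),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("A", "B", "C"), NULL))
d <- dist(pts)

# Without its attributes the object is a plain numeric vector holding the
# lower triangle column by column: (A,B), (A,C), (B,C).
vals <- as.vector(d)

# Everything else lives in the attributes.
attr_names <- names(attributes(d))
size <- attr(d, "Size")
labels <- attr(d, "Labels")

print(vals)
print(attr_names)
```

The attribute names printed here are the ones you need to supply when you build the object by hand.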

Using your df, this is how:

dij2 <- with(df, Distance)
nams <- with(df, unique(c(as.character(site.x), as.character(site.y))))
attributes(dij2) <- with(df, list(Size = length(nams),
                                  Labels = nams,
                                  Diag = FALSE,
                                  Upper = FALSE,
                                  method = "user"))
class(dij2) <- "dist"

Or you can do this via structure() directly:

dij3 <- with(df, structure(Distance,
                           Size = length(nams),
                           Labels = nams,
                           Diag = FALSE,
                           Upper = FALSE,
                           method = "user",
                           class = "dist"))

These give:

> df
  site.x site.y Distance
1      A      B       67
2      A      C       57
3      A      D       64
4      B      C       60
5      B      D       67
6      C      D       60
> dij2
   A  B  C
B 67      
C 57 60   
D 64 67 60
> dij3
   A  B  C
B 67      
C 57 60   
D 64 67 60

Note: The code above does not check that the data are in the right order. A dist object stores the lower triangle of the distance matrix column by column, so make sure the rows of df are ordered as they are in the example, i.e. sorted by site.x and then by site.y, before you run the code I show.
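If you cannot guarantee the row order, one option is to compute where each pair belongs instead of relying on sorting. The sketch below is not part of the original answer: `pairs_to_dist` is a hypothetical helper and `df_shuffled` is the question's data with the rows deliberately mixed up. It uses the index formula from ?dist for the position of pair (i, j), i < j, in the stored vector.

```r
# The question's six distances, in scrambled row order (made up for the demo).
df_shuffled <- data.frame(site.x = c("C", "A", "B", "A", "B", "A"),
                          site.y = c("D", "C", "D", "B", "C", "D"),
                          Distance = c(60, 57, 67, 67, 60, 64))

# Hypothetical helper (not in base R): build a "dist" object from a long
# table of pairs, whatever order the rows come in.
pairs_to_dist <- function(from, to, value) {
  from <- as.character(from)
  to <- as.character(to)
  labs <- sort(unique(c(from, to)))
  n <- length(labs)
  # Position of each site in the label vector; i is always the smaller one.
  a <- match(from, labs)
  b <- match(to, labs)
  i <- pmin(a, b)
  j <- pmax(a, b)
  stopifnot(all(i < j))  # no self-pairs allowed
  # Index of pair (i, j) in the column-wise lower triangle (see ?dist).
  k <- n * (i - 1) - i * (i - 1) / 2 + j - i
  # Every pair must appear exactly once.
  stopifnot(!anyDuplicated(k), length(k) == n * (n - 1) / 2)
  out <- numeric(length(k))
  out[k] <- value
  structure(out, Size = n, Labels = labs, Diag = FALSE, Upper = FALSE,
            method = "user", class = "dist")
}

d_any_order <- with(df_shuffled, pairs_to_dist(site.x, site.y, Distance))
print(d_any_order)
```

The two stopifnot() calls are there so that a missing, duplicated, or self-referencing pair fails loudly rather than silently producing a misaligned object.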

I had a similar problem not too long ago and solved it like this:

n <- max(table(df$site.x)) + 1  # +1 leaves room for the diagonal
res <- lapply(with(df, split(Distance, df$site.x)), function(x) c(rep(NA, n - length(x)), x))
res <- do.call("rbind", res)
res <- rbind(res, rep(NA, n))
res <- as.dist(t(res))

?as.dist() should help you, although it expects a matrix as input.
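Since as.dist() wants a matrix, one way to get there is to fill a square, labelled matrix from the pair table first. This is a minimal sketch using the question's data; the names `sites`, `m`, `idx` and `d_mat` are my own, and putting zeros on the diagonal is an assumption (as.dist() only reads the lower triangle).

```r
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))

sites <- with(df, unique(c(as.character(site.x), as.character(site.y))))

# Square matrix with site names on both dimensions, zeros everywhere to start.
m <- matrix(0, nrow = length(sites), ncol = length(sites),
            dimnames = list(sites, sites))

# A two-column character matrix selects cells by (row name, column name).
idx <- cbind(as.character(df$site.x), as.character(df$site.y))
m[idx] <- df$Distance          # fills the upper triangle
m[idx[, 2:1]] <- df$Distance   # mirrors the values into the lower triangle

d_mat <- as.dist(m)
print(d_mat)
```

Because cells are addressed by name, the row order of df does not matter with this approach.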

For people coming in from Google: the acast function in the reshape2 library is way easier for this kind of stuff.

library(reshape2)
acast(df, site.x ~ site.y, value.var='Distance', fun.aggregate = sum, margins=FALSE)
