Convert a dataframe to an object of class "dist" without actually calculating distances in R
I have a dataframe with distances:
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))
I need to convert this to an object of class "dist", but I do not need to calculate any distances, so I cannot use the dist() function. Any advice?
There is nothing stopping you from creating the dist object yourself. It is just a vector of distances with attributes that set up the labels, size, etc.
Using your df, this is how:
dij2 <- with(df, Distance)
nams <- with(df, unique(c(as.character(site.x), as.character(site.y))))
attributes(dij2) <- with(df, list(Size = length(nams),
Labels = nams,
Diag = FALSE,
Upper = FALSE,
method = "user"))
class(dij2) <- "dist"
Or you can do this via structure() directly:
dij3 <- with(df, structure(Distance,
Size = length(nams),
Labels = nams,
Diag = FALSE,
Upper = FALSE,
method = "user",
class = "dist"))
These give:
> df
  site.x site.y Distance
1      A      B       67
2      A      C       57
3      A      D       64
4      B      C       60
5      B      D       67
6      C      D       60
> dij2
   A  B  C
B 67
C 57 60
D 64 67 60
> dij3
   A  B  C
B 67
C 57 60
D 64 67 60
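As a quick sanity check, the hand-built object can be expanded back into a full matrix and passed to a function that expects a "dist". The sketch below rebuilds the example so it runs on its own; the names m and hc are only illustrative, and hclust() is just one example of a consumer of "dist" objects.

```r
# Rebuild the example data and the hand-made "dist" object
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))
nams <- unique(c(as.character(df$site.x), as.character(df$site.y)))
dij3 <- structure(df$Distance, Size = length(nams), Labels = nams,
                  Diag = FALSE, Upper = FALSE, method = "user",
                  class = "dist")

# Expand to a full symmetric matrix to look up individual pairs
m <- as.matrix(dij3)
m["A", "D"]     # distance between A and D
isSymmetric(m)  # both triangles should agree

# Functions that expect a "dist" object should accept it as well
hc <- hclust(dij3)
```

If a looked-up pair does not match the value in df, the distances were most likely not in the order a "dist" vector expects.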
Note: The above does no checking that the data are in the right order. Make sure the data in df are in the correct order, as they are in the example; i.e. sort by site.x then site.y before you run the code I show.
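A minimal sketch of that sorting step, using a shuffled copy of the example data. It assumes every pair of sites appears exactly once and that the alphabetically smaller site is always in site.x; the names df_sorted and dij are made up for this illustration.

```r
# Same distances as the example, but with the rows shuffled
df <- data.frame(site.x = c("B", "A", "C", "A", "B", "A"),
                 site.y = c("D", "C", "D", "B", "C", "D"),
                 Distance = c(67, 57, 60, 67, 60, 64))

# Sort by site.x, then site.y, so the values follow the column-wise
# lower-triangle order that a "dist" vector uses
df_sorted <- df[order(as.character(df$site.x), as.character(df$site.y)), ]

# Sorted labels, so they line up with the sorted pairs
nams <- sort(unique(c(as.character(df$site.x), as.character(df$site.y))))

dij <- structure(df_sorted$Distance, Size = length(nams), Labels = nams,
                 Diag = FALSE, Upper = FALSE, method = "user",
                 class = "dist")
dij
```

If the pairs can appear in either orientation (B-A as well as A-B), sorting alone is not enough and filling a square matrix by name is the safer route.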
I had a similar problem not too long ago and solved it like this:
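n <- max(table(df$site.x)) + 1 # number of sites: the most frequent site.x has n - 1 partners
res <- lapply(split(df$Distance, df$site.x), function(x) c(rep(NA, n - length(x)), x)) # left-pad each site's distances with NA
res <- do.call("rbind", res)
res <- rbind(res, rep(NA, n)) # add an empty row for the last site
res <- as.dist(t(res)) # transpose so the distances sit in the lower triangle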
?as.dist() should help you, although it requires a matrix as input.
For people coming in from Google... The acast function in the reshape2 library is way easier for this kind of stuff.
library(reshape2)
acast(df, site.x ~ site.y, value.var='Distance', fun.aggregate = sum, margins=FALSE)
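Note that as.dist() reads the lower triangle of a square matrix whose rows and columns carry the same labels, whereas the cast above has the site.x values as rows and the site.y values as columns. One way to get a square, symmetric matrix is sketched below in base R; this is an alternative approach rather than what acast does, and the names sites, m, idx and d are only illustrative.

```r
df <- data.frame(site.x = c("A", "A", "A", "B", "B", "C"),
                 site.y = c("B", "C", "D", "C", "D", "D"),
                 Distance = c(67, 57, 64, 60, 67, 60))

# All site labels, used for both rows and columns
sites <- sort(unique(c(as.character(df$site.x), as.character(df$site.y))))

# Square matrix with a zero diagonal
m <- matrix(0, nrow = length(sites), ncol = length(sites),
            dimnames = list(sites, sites))

# Row/column positions of each pair
idx <- cbind(match(as.character(df$site.x), sites),
             match(as.character(df$site.y), sites))

m[idx] <- df$Distance          # fill cell (x, y)
m[idx[, 2:1]] <- df$Distance   # mirror into cell (y, x)

d <- as.dist(m)
d
```

Because the cells are filled by label, this does not depend on the order of the rows in df or on which site of a pair is listed first.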