Turn a named vector into a symmetric matrix in R?
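Similar questions have been asked before, but none of them had the added element of having to split the vector names, so I am asking a new question.
I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector identifies the pair of variables that the value belongs to, so I need to split the names into their component parts.
For example, if my data looks like this: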
v <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
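We can see that each name is a combination of two variables. I am trying to turn this into a symmetric matrix (with zeros on the diagonal). For clarity, my desired output would look like: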
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
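Any suggestions as to how I could do this?
Since one of the answers highlighted that my question was too vague, I am editing it to reflect this. I am looking for a general solution to this problem, whatever the names in the vector are. For example, my named vector could look like this: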
v <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)
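An option with igraph (on igraph 2.0 or later, get.adjacency is deprecated; as_adjacency_matrix accepts the same attr and sparse arguments):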
library(igraph)
cbind(read.table(text = names(v)), v) %>%
graph_from_data_frame(directed = FALSE) %>%
get.adjacency(attr = "v", sparse = FALSE)
which gives
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
A base R option:
> d <- read.table(text = names(v))
> xtabs(v ~ ., cbind(rbind(d, setNames(rev(d), names(d))), v = rep(v, 2)))
V2
V1 x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
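The one-liner above packs several steps together. As a rough sketch of what it appears to do, here it is unrolled on a small made-up input (the items a, b, c and the intermediate name both are illustrative assumptions, not part of the answer):

```r
# Small illustrative input (three made-up items a, b, c).
v <- c("a b" = 1, "a c" = 2, "b c" = 3)

# Step 1: split each name into its two components (columns V1 and V2).
d <- read.table(text = names(v))

# Step 2: mirror the edge list so that every pair appears in both orders.
both <- rbind(d, setNames(rev(d), names(d)))
both$v <- rep(v, 2)

# Step 3: cross-tabulate; pairs that never occur (the diagonal) become 0.
m <- xtabs(v ~ V1 + V2, data = both)
m
```

Because every pair is present in both orders before the cross-tabulation, the result should come out symmetric without a separate transpose step.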
We can split the names, expand the data to create the missing combinations (complete), and reshape to wide format with pivot_wider:
library(dplyr)
library(tidyr)
library(stringr)
library(tibble)
d1 <- read.table(text = names(v), header = FALSE)
un1 <- sort(unique(unlist(d1)))
out <- d1%>%
mutate(v = v) %>%
complete(V1 = un1, V2 = un1,
fill = list(v = 0)) %>%
pivot_wider(names_from = V1, values_from = v) %>%
column_to_rownames('V2') %>%
as.matrix %>%
{. + t(.)}
Output:
> out
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
Or using base R:
d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)
Output:
m1
x1 x2 x3 x4 x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
Using the second example:
> m1
apple banana orange pear plum
apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
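To sanity-check a result like this, one could verify the properties the question asks for. The sketch below rebuilds the base R approach on a small made-up input (items a, b, c are assumptions for illustration) and then checks symmetry, the zero diagonal, and that every named pair maps back to its value:

```r
# Rebuild the base R result for a small made-up input, then verify it.
v <- c("a b" = 1, "a c" = 2, "b c" = 3)
d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, length(un1), length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[rownames(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)

# Three properties the question asks for.
isSymmetric(m1)               # symmetric
all(diag(m1) == 0)            # zero diagonal
all(m1[as.matrix(d1)] == v)   # each named pair maps back to its value
```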
1) We use scan to generate the vertices v, and then nested sapply calls to generate the required matrix. No packages are used.
edge2adj <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
sapply(v, function(i) sapply(v, function(j)
Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}
# tests where v1 and v2 are the two examples in the question
edge2adj(v1)
## x1 x2 x3 x4 x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
edge2adj(v2)
## apple banana orange pear plum
## apple 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
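The core of edge2adj is the lookup inside the nested sapply. As a small sketch of that idea on its own (the vector e and the helper name lookup are made up for illustration), indexing by an absent name yields NA, and Find then picks the first non-NA candidate:

```r
e <- c("a b" = 1, "b c" = 3)   # made-up edges; note there is no "a c" entry

# Indexing by a name that is absent yields NA rather than an error.
e["b a"]

# Look the pair up in both orders and fall back to 0 if neither exists.
lookup <- function(i, j) {
  Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0))
}

lookup("a", "b")   # stored as "a b"
lookup("b", "a")   # found through the reversed name
lookup("a", "c")   # absent in both orders, so the fallback 0 is used
```

This is why (1) does not depend on the order of the input or on every pair being present.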
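2) (1) is probably preferable to this alternative because of its greater generality, but we point out that if we know the edges are in the order shown in the question (sorted and in upper-triangular order), then we can use upper.tri like this. No packages are used.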
edge2adj2 <- function(e) {
v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
m <- sapply(v, function(i) sapply(v, function(j) 0))
m[upper.tri(m)] <- e
m + t(m)
}
identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE
identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
Note: the inputs v1 and v2 used in the tests above are the two examples from the question:
v1 <- c(
"x1 x2" = 0.81899860,
"x1 x3" = 0.10764701,
"x2 x3" = 0.03923967,
"x1 x4" = 0.03457240,
"x2 x4" = 0.05954789,
"x3 x4" = 0.15535316,
"x1 x5" = 0.04041266,
"x2 x5" = 0.05421003,
"x3 x5" = 0.09198977,
"x4 x5" = 0.15301872
)
v2 <- c(
"apple banana" = 0.81899860,
"apple orange" = 0.10764701,
"banana orange" = 0.03923967,
"apple pear" = 0.03457240,
"banana pear" = 0.05954789,
"orange pear" = 0.15535316,
"apple plum" = 0.04041266,
"banana plum" = 0.05421003,
"orange plum" = 0.09198977,
"pear plum" = 0.15301872
)
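The ordering caveat in (2) can be made concrete. The following sketch uses a made-up three-item input and a hypothetical helper fill_upper that follows the same idea as edge2adj2; it suggests that the upper.tri approach fills cells by position and silently ignores the names:

```r
# Made-up three-item input in the same order as the question's data.
e <- c("a b" = 1, "a c" = 2, "b c" = 3)

fill_upper <- function(e) {   # same idea as edge2adj2 above
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  m <- matrix(0, length(v), length(v), dimnames = list(v, v))
  m[upper.tri(m)] <- e        # fills by position, ignoring the names
  m + t(m)
}

ok <- fill_upper(e)
shuffled <- fill_upper(rev(e))   # same pairs, different order

ok["a", "b"]         # the value stored under "a b"
shuffled["a", "b"]   # now holds the value of "b c" instead
```

If the input order cannot be guaranteed, the name-based lookup in (1) is the safer choice.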
Another base R way:
comb = names(v)
inds = sapply(comb, function(x){
c(unlist(strsplit(x = x,split = " ",fixed = TRUE)))},
simplify = TRUE)
inds1 = rbind(inds[2,],inds[1,])
# derive the dimnames from the names themselves rather than hard-coding x1..x5
nm = sort(unique(c(inds)))
m = matrix(data = 0, nrow = length(nm), ncol = length(nm), dimnames = list(nm, nm))
m[t(inds)]=v
m[t(inds1)]=v
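The m[t(inds)] assignment above relies on indexing a matrix with a two-column character matrix, where each row names one cell by its row and column names. A minimal sketch of that mechanism, using made-up names a, b, c:

```r
m <- matrix(0, 3, 3, dimnames = list(c("a", "b", "c"), c("a", "b", "c")))

# Each row of the index matrix is one (row name, column name) pair.
idx <- rbind(c("a", "b"),
             c("b", "c"))

m[idx] <- c(1, 3)          # sets m["a", "b"] and m["b", "c"]
m[idx[, 2:1]] <- c(1, 3)   # swap the columns to fill the mirrored cells
m
```

Cells that are never indexed, including the diagonal, keep their initial value of 0.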