Turn a named vector into a symmetric matrix in R?

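Similar questions have been asked before, but none of them had the additional element of having to split the vector's names, so I am asking a new question.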

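I am trying to turn a named vector into a symmetric matrix in R. Each name in my vector identifies the pair of variables that the value belongs to, so I need to split the names into their component parts.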

For example, if my data looks like this:

v <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

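We can see that each name is a combination of 2 variables. I am trying to turn this into a symmetric matrix (with zeros on the diagonal). For clarity, my desired output would look like: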

           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Any suggestions as to how I could do this?

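EDIT

Since one of the answers highlighted that my question was too vague, I am editing it to reflect this. I am looking for a general solution to this problem, whatever the names in the vector are. For example, my named vector could look like this: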

v <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)

An option with igraph

library(igraph)

cbind(read.table(text = names(v)), v) %>%
  graph_from_data_frame(directed = FALSE) %>%
  get.adjacency(attr = "v", sparse = FALSE)

           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

A base R option

> d <- read.table(text = names(v))

> xtabs(v ~ ., cbind(rbind(d, setNames(rev(d), names(d))), v = rep(v, 2)))
    V2
V1           x1         x2         x3         x4         x5
  x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
  x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
  x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
  x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
  x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
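
The one-liner above can be hard to read at first, so here is a rough step-by-step sketch of the same idea. It uses only the first three pairs from the question for brevity, and the intermediate name `both` is introduced here purely for illustration. The final clean-up lines are an optional addition, assuming a plain matrix without the V1/V2 labels is wanted.

```r
# First three pairs from the question, for brevity
v <- c("x1 x2" = 0.81899860, "x1 x3" = 0.10764701, "x2 x3" = 0.03923967)

# Split the "a b" names into a two-column data frame (columns V1 and V2)
d <- read.table(text = names(v))

# Stack every pair in both orientations so that (a, b) and (b, a) both appear
both <- rbind(d, setNames(rev(d), names(d)))
both$v <- rep(v, 2)

# Cross-tabulate; cells with no matching pair (the diagonal) sum to 0
m <- xtabs(v ~ V1 + V2, data = both)

# Optional: strip the xtabs class, call attribute and V1/V2 labels
m <- unclass(m)
attr(m, "call") <- NULL
names(dimnames(m)) <- NULL
m
```

Because every label shows up in both columns after stacking, the rows and columns of the table come out in the same order, which is what makes the result symmetric.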

We can split the names, expand the data to create the missing combinations (complete), and reshape to wide with pivot_wider

library(dplyr)
library(tidyr)
library(stringr)
library(tibble)
d1 <- read.table(text = names(v), header = FALSE)
un1 <- sort(unique(unlist(d1)))
out <- d1%>% 
   mutate(v = v) %>% 
   complete(V1 = un1, V2 = un1, 
     fill = list(v = 0)) %>% 
   pivot_wider(names_from = V1, values_from = v) %>% 
   column_to_rownames('V2') %>% 
   as.matrix %>% 
   {. + t(.)}

-output

> out
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Or using base R

d1 <- read.table(text = names(v))
un1 <- sort(unique(unlist(d1)))
m1 <- matrix(0, ncol = length(un1), nrow = length(un1), dimnames = list(un1, un1))
m2 <- xtabs(v ~ ., d1)
m1[row.names(m2), colnames(m2)] <- m2
m1 <- m1 + t(m1)

-output

m1
           x1         x2         x3         x4         x5
x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

Using the second example

> m1
            apple     banana     orange       pear       plum
apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

1) We use scan to generate the vertices v and then use a nested sapply to generate the required matrix. No packages are used.

edge2adj <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  sapply(v, function(i) sapply(v, function(j) 
    Find(Negate(is.na), c(e[paste(i, j)], e[paste(j, i)], 0) )))
}


# tests where v1 and v2 are the two examples in the question

edge2adj(v1)
##            x1         x2         x3         x4         x5
## x1 0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## x2 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## x3 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## x4 0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## x5 0.04041266 0.05421003 0.09198977 0.15301872 0.00000000

edge2adj(v2)
##             apple     banana     orange       pear       plum
## apple  0.00000000 0.81899860 0.10764701 0.03457240 0.04041266
## banana 0.81899860 0.00000000 0.03923967 0.05954789 0.05421003
## orange 0.10764701 0.03923967 0.00000000 0.15535316 0.09198977
## pear   0.03457240 0.05954789 0.15535316 0.00000000 0.15301872
## plum   0.04041266 0.05421003 0.09198977 0.15301872 0.00000000
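
To make the lookup inside edge2adj easier to follow, here is a small sketch of what happens for a single cell. The helper name `pair_weight` is hypothetical and not part of the answer; it just isolates the `Find(Negate(is.na), ...)` step.

```r
# Two pairs from the second example in the question
e <- c("apple banana" = 0.81899860, "apple orange" = 0.10764701)

# Hypothetical helper: weight of the unordered pair {i, j}, or 0 if absent
pair_weight <- function(e, i, j) {
  # Indexing a named vector by a name it does not contain yields NA,
  # so try both orderings and fall back to 0
  candidates <- c(e[paste(i, j)], e[paste(j, i)], 0)
  unname(Find(Negate(is.na), candidates))
}

pair_weight(e, "apple", "banana")  # stored under "apple banana"
pair_weight(e, "banana", "apple")  # reversed order, same stored value
pair_weight(e, "apple", "apple")   # no such pair, so the fallback 0
```

Since both orderings are tried for every cell, this approach does not depend on how the names in the input are ordered.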

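2) (1) is probably preferable to this alternative because of its greater generality, but we point out that if we know the edges are in the order shown in the question (sorted and in upper triangular order), then we can use upper.tri like this. No packages are used.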

edge2adj2 <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  m <- sapply(v, function(i) sapply(v, function(j) 0))
  m[upper.tri(m)] <- e
  m + t(m)
}

identical(edge2adj(v1), edge2adj2(v1))
## [1] TRUE

identical(edge2adj(v2), edge2adj2(v2))
## [1] TRUE
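
Since (2) silently gives a wrong matrix if the edges arrive in any other order, it may be worth checking that assumption first. The following is only a sketch; the function name `in_upper_tri_order` is hypothetical and a small three-variable vector stands in for the question's data.

```r
# Hypothetical check: are the names of e in sorted, column-major
# upper-triangle order, as approach (2) assumes?
in_upper_tri_order <- function(e) {
  v <- sort(unique(scan(text = names(e), what = "", quiet = TRUE)))
  # Row/column positions of the upper triangle, in the order that
  # m[upper.tri(m)] <- e would fill them
  idx <- which(upper.tri(diag(length(v))), arr.ind = TRUE)
  expected <- paste(v[idx[, "row"]], v[idx[, "col"]])
  identical(names(e), expected)
}

ok  <- c("a b" = 1, "a c" = 2, "b c" = 3)
bad <- c("b c" = 3, "a b" = 1, "a c" = 2)

in_upper_tri_order(ok)
in_upper_tri_order(bad)
```

If the check fails, the order-independent function in (1) is the safer choice.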

Note

v1 <- c(
  "x1 x2" = 0.81899860,
  "x1 x3" = 0.10764701,
  "x2 x3" = 0.03923967,
  "x1 x4" = 0.03457240,
  "x2 x4" = 0.05954789,
  "x3 x4" = 0.15535316,
  "x1 x5" = 0.04041266,
  "x2 x5" = 0.05421003,
  "x3 x5" = 0.09198977,
  "x4 x5" = 0.15301872
)

v2 <- c(
  "apple banana" = 0.81899860,
  "apple orange" = 0.10764701,
  "banana orange" = 0.03923967,
  "apple pear" = 0.03457240,
  "banana pear" = 0.05954789,
  "orange pear" = 0.15535316,
  "apple plum" = 0.04041266,
  "banana plum" = 0.05421003,
  "orange plum" = 0.09198977,
  "pear plum" = 0.15301872
)

Another base R way:

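comb = names(v)
# split each name into its two parts; the result is a 2 x n character matrix
inds = sapply(comb, function(x){
        c(unlist(strsplit(x = x,split = " ",fixed = TRUE)))},
        simplify = TRUE)
inds1 = rbind(inds[2,],inds[1,])

# take the labels from the names rather than hard-coding x1..x5,
# so the second example (apple, banana, ...) works as well
labs = sort(unique(c(inds)))
m = matrix(data = 0, nrow = length(labs),ncol = length(labs),dimnames = list(labs,labs))

# fill both triangles by indexing with two-column character matrices
m[t(inds)]=v
m[t(inds1)]=v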
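
The assignments `m[t(inds)]=v` and `m[t(inds1)]=v` rely on a base R feature: a matrix with dimnames can be indexed by a two-column character matrix, where each row gives one (row name, column name) cell. A minimal sketch of that feature on toy data (the names `a`, `b`, `c` and `idx` are made up for illustration):

```r
# 3 x 3 zero matrix with matching row and column names
labs <- c("a", "b", "c")
m <- matrix(0, nrow = 3, ncol = 3, dimnames = list(labs, labs))

# Each row of idx names one cell: ("a", "b") and ("b", "c")
idx <- rbind(c("a", "b"), c("b", "c"))

m[idx] <- c(1, 2)          # fill the upper-triangle cells
m[idx[, 2:1]] <- c(1, 2)   # swap the columns to fill the mirrored cells
m
```

Swapping the two columns of the index matrix plays the same role as building `inds1` from `inds` above.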
