
Generate a weighted matrix from an R data frame

I have a toy example of a data frame:

df <- data.frame(matrix(, nrow = 5, ncol = 0))
df["A|A"] <- c(0.3, 0, 0, 100, 23)
df["A|B"] <- c(0, 0, 0.3, 10, 0.23)
df["A|C"] <- c(0.3, 0.1, 0, 100, 2)
df["B|B"] <- c(0, 0, 0, 12, 2)
# Note: this second assignment overwrites the "B|B" column above,
# so the data frame ends up with no "B|A" column.
df["B|B"] <- c(0, 0, 0.3, 0, 0.23)
df["B|C"] <- c(0.3, 0, 0, 21, 3)
df["C|A"] <- c(0.3, 0, 1, 100, 0)
df["C|B"] <- c(0, 0, 0.3, 10, 0.2)
df["C|C"] <- c(0.3, 0, 1, 1, 0.3)

I need to get a matrix with counts of non-zero values between A and A, A and B, ..., C and C.


I started by splitting the column names and assigning the parts to variables, but I don't know how to fill a matrix at specific rows and columns inside a loop:

counts <- colSums(df != 0)
df <- rbind(df, counts)
for(i in colnames(df)) {
  cluster1 <- (strsplit(i, "\\|")[[1]])[1]
  cluster2 <- (strsplit(i, "\\|")[[1]])[2]
  
}

A base R option:

> table(read.table(text = rep(names(df), colSums(df > 0)), sep = "|"))
   V2
V1  A B C
  A 3 3 4
  B 0 2 3
  C 3 3 4
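
To see why the one-liner works, it can help to unpack it into its intermediate steps. The sketch below is only an illustration: it rebuilds the toy data with `check.names = FALSE` (an assumption made so the block is self-contained, keeping only the final "B|B" column), and the names `n_pos`, `labels`, `pairs` and `tab` are made up here.

```r
# Toy data from the question; check.names = FALSE keeps names such as "A|A".
df <- data.frame(
  "A|A" = c(0.3, 0, 0, 100, 23),
  "A|B" = c(0, 0, 0.3, 10, 0.23),
  "A|C" = c(0.3, 0.1, 0, 100, 2),
  "B|B" = c(0, 0, 0.3, 0, 0.23),
  "B|C" = c(0.3, 0, 0, 21, 3),
  "C|A" = c(0.3, 0, 1, 100, 0),
  "C|B" = c(0, 0, 0.3, 10, 0.2),
  "C|C" = c(0.3, 0, 1, 1, 0.3),
  check.names = FALSE
)

# Step 1: number of strictly positive entries in each column.
n_pos <- colSums(df > 0)

# Step 2: repeat each column name once per positive entry.
labels <- rep(names(df), n_pos)

# Step 3: split every "X|Y" label into two columns (V1 and V2).
pairs <- read.table(text = labels, sep = "|")

# Step 4: cross-tabulate the two columns to get the count matrix.
tab <- table(pairs)
print(tab)
```

Because `A` still appears as a second label (from "A|A" and "C|A"), `table()` keeps an `A` column and reports 0 for the missing B/A pair.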

or a longer version:

table(
    data.frame(
        do.call(
            rbind,
            strsplit(
                as.character(subset(stack(df), values > 0)$ind),
                "\\|"
            )
        )
    )
)

gives

   X2
X1  A B C
  A 3 3 4
  B 0 2 3
  C 3 3 4
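
If an explicit loop is preferred, as in the attempt from the question, one possible way to finish it is to pre-allocate a zero matrix with named rows and columns and then index it by the two labels of each column name. This is a minimal sketch, not the answer's own method: it assumes the original `df` (without the extra `counts` row appended by `rbind`), and the names `parts`, `first`, `second`, `labs` and `m` are hypothetical.

```r
# Toy data from the question; check.names = FALSE keeps names such as "A|A".
df <- data.frame(
  "A|A" = c(0.3, 0, 0, 100, 23),
  "A|B" = c(0, 0, 0.3, 10, 0.23),
  "A|C" = c(0.3, 0.1, 0, 100, 2),
  "B|B" = c(0, 0, 0.3, 0, 0.23),
  "B|C" = c(0.3, 0, 0, 21, 3),
  "C|A" = c(0.3, 0, 1, 100, 0),
  "C|B" = c(0, 0, 0.3, 10, 0.2),
  "C|C" = c(0.3, 0, 1, 1, 0.3),
  check.names = FALSE
)

# Split each column name on the literal "|" into its two cluster labels.
parts  <- strsplit(names(df), "|", fixed = TRUE)
first  <- vapply(parts, `[`, character(1), 1)
second <- vapply(parts, `[`, character(1), 2)

# Pre-allocate a square matrix of zeros, named by every label that occurs.
labs <- sort(unique(c(first, second)))
m <- matrix(0L, nrow = length(labs), ncol = length(labs),
            dimnames = list(labs, labs))

# Fill one cell per column: the number of non-zero values in that column.
for (j in seq_along(df)) {
  m[first[j], second[j]] <- sum(df[[j]] != 0)
}
print(m)
```

Pairs that never occur as a column name (here B/A) simply keep their initial value of 0.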

Reshape the data into 'long' format with pivot_longer, then separate the 'name' column into two, and reshape back to 'wide' with pivot_wider, specifying values_fn as a lambda function that counts the values. The lambda tests `> 0`, which matches the non-zero count here because the data contain no negative values.

library(dplyr)
library(tidyr)
df %>% 
    pivot_longer(cols = everything()) %>%
    separate(name, into = c('name1', 'name2')) %>%
    pivot_wider(names_from = name2, values_from = value, 
       values_fn = list(value = ~ sum(. > 0)), values_fill = 0)

-output

# A tibble: 3 x 4
  name1     A     B     C
  <chr> <int> <int> <int>
1 A         3     3     4
2 B         0     2     3
3 C         3     3     4
