
How do I create a similarity matrix from a similarity data frame?

I found this online and used it with my data:

df <- data.frame(seasons = c("Season1","Season2","Season3","Season4"))
for(i in unique(df$seasons)) {
  df[[paste0(i)]] <- ifelse(df$seasons==i,1,0)
}

The only challenge is that wherever there is a 0 in a resulting cell, I want to fill in a meaningful value from a data frame that has data arranged like so:

S1       S2       Value
Season1  Season2  3
Season3  Season1  5
Season2  Season3  4

Note how a season in a pair could pop up in either S1 or S2.

For example, I'll need to fill both {row Season1; col Season2} and {col Season1; row Season2} in my matrix with 3.

Is there any way for me to do this? I tried a few things but decided to give a shout-out to the community in case there is something simple out there I'm missing!

Thanks a bunch!

There are three steps. I decided to rebuild the original matrix and call it S:

# Make a square matrix with 1s on the diagonal and 0s elsewhere
# (here df is the three-column pairs data frame: S1, S2, Value)
rc <- length(unique(df[[1]]))  # assumes both columns contain the same set of unique values
S <- diag(1, rc, rc)

# Label rows and cols
dimnames(S) <- list(sort(unique(df[[1]])), sort(unique(df[[2]])))

# Assign values to matrix positions based on df[[3]].
# data.matrix() turns the two season columns into integer codes,
# which are used as (row, col) pairs via two-column matrix indexing
S[data.matrix(df[1:2])] <- df[[3]]

# -------
> S
        Season1 Season2 Season3
Season1       1       3       0
Season2       0       1       4
Season3       5       0       1

It's now a real matrix rather than a data frame.
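
Note that the indexing above fills only the {row S1; col S2} cell for each pair, so the result is not symmetric. If both directions should carry the value, as the question asks, a minimal sketch might look like the following. The name `pairs_df` and its column names are assumptions made for illustration, and the seasons are taken only from the pairs table, so a season with no pair (such as Season4) would not appear.

```r
# Pairs table from the question; `pairs_df` and its column names are assumed.
pairs_df <- data.frame(
  S1    = c("Season1", "Season3", "Season2"),
  S2    = c("Season2", "Season1", "Season3"),
  Value = c(3, 5, 4),
  stringsAsFactors = FALSE
)

# Collect every season that appears in either column.
seasons <- sort(unique(c(pairs_df$S1, pairs_df$S2)))

# Start from 1s on the diagonal and 0s elsewhere, then label both dimensions.
S <- diag(1, length(seasons))
dimnames(S) <- list(seasons, seasons)

# A two-column character matrix indexes cells by (row name, column name).
idx <- as.matrix(pairs_df[c("S1", "S2")])
S[idx] <- pairs_df$Value                       # fill {row S1; col S2}
S[idx[, 2:1, drop = FALSE]] <- pairs_df$Value  # fill the mirrored cell

S
#         Season1 Season2 Season3
# Season1       1       3       5
# Season2       3       1       4
# Season3       5       4       1
```

Indexing by name rather than by integer code means the two columns do not need to contain the same set of seasons, which the original approach assumes.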
