Cluster analysis in R (hclust): how to determine which variables drive the clusters

Question

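I am running a cluster analysis with hclust on plant species cover data from sampling sites.

My study recorded the cover of 55 species at 100 sites. Plant cover at each site was measured on a 0-4 cover-class scale, where 0 is absent, 1 is 1-25% cover, … and 4 is 76-100% cover.

I am using Euclidean distance to measure differences in species cover between sites, and I would like to know which plant species are driving the grouping at each branch of the dendrogram. See the example df and code below; each row represents one site.

In the simplified example, I can see that sp1 is driving the association between sites 3 and 4. In my much larger dataset, how can I determine which species is (or are) driving the associations at different levels of my dendrogram?

Let me know if I can clarify anything. Thanks for your help!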

library(tidyverse)

site <- c(1:10)
sp1 <- c(0,1,4,4,3,3,2,1,0,2)
sp2 <- c(4,3,0,0,2,2,3,2,1,3)
sp3 <- c(3,2,1,1,2,2,3,2,1,3)
sp4 <- c(2,4,1,0,1,2,3,4,3,1)
df <- data.frame(site, sp1, sp2, sp3, sp4)

species <- select(df, sp1:sp4)

dend <- species %>% 
  dist(method = "euclidean") %>% 
  hclust(method = "ward.D") %>% 
  as.dendrogram()

plot(dend, ylab = "Euclidean Distance")

Answer 1

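Follow-up: I ended up assigning the sites in each cluster to arbitrary association groups and then running an indicator species analysis on those groups with the multipatt function from indicspecies. This let me identify the species that significantly drive the clustering of the different groups.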

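library(indicspecies)  # provides multipatt()

# Assign each site to an association group read off the dendrogram
clusters <- df %>% mutate(Association = 
                  case_when(site %in% c(3, 4)~1, 
                            site %in% c(2, 8, 9)~2, 
                            site %in% c(1, 5, 6, 7, 10)~3))

# Species columns (sp1 to sp4) and the group label for each site
abundance <- clusters[2:5]
association <- clusters$Association

# Indicator species analysis with 9999 permutations
indicator_r.g <- multipatt(abundance, association, func = "r.g", control = how(nperm=9999))
summary(indicator_r.g)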


Multilevel pattern analysis
 ---------------------------

 Association function: r.g
 Significance level (alpha): 0.05

 Total number of species: 4
 Selected number of species: 4 
 Number of species associated to 1 group: 3 
 Number of species associated to 2 groups: 1 

 List of species associated to each combination: 

 Group 1  #sps.  1 
    stat p.value  
sp1 0.82  0.0193 *

 Group 2  #sps.  1 
     stat p.value  
sp4 0.832  0.0161 *

 Group 3  #sps.  1 
     stat p.value  
sp3 0.781  0.0317 *

 Group 2+3  #sps.  1 
     stat p.value  
sp2 0.844  0.0293 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
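
For a dataset with 100 sites, typing the site numbers for each group by hand may become tedious. As a minimal sketch (not part of the original answer), the group labels could instead be taken from the tree with cutree() from base R, and the mean cover class of each species per cluster can then be tabulated as a quick descriptive check alongside multipatt. The choice of k = 3 and the names hc, groups and group_means are assumptions made for illustration; the resulting groups are not guaranteed to match the hand-assigned ones above.

```r
# Same toy data as in the question
sp1 <- c(0,1,4,4,3,3,2,1,0,2)
sp2 <- c(4,3,0,0,2,2,3,2,1,3)
sp3 <- c(3,2,1,1,2,2,3,2,1,3)
sp4 <- c(2,4,1,0,1,2,3,4,3,1)
species <- data.frame(sp1, sp2, sp3, sp4)

# Same distance and linkage as in the question, kept as an hclust object
hc <- hclust(dist(species, method = "euclidean"), method = "ward.D")

# Cut the tree into k groups (k = 3 is an assumption; pick it from the dendrogram)
groups <- cutree(hc, k = 3)

# Mean cover class of each species within each cluster (descriptive only)
group_means <- aggregate(species, by = list(cluster = groups), FUN = mean)
print(group_means)
```

The groups vector could be passed to multipatt in place of the hand-built association column. The table of means only describes the clusters; it does not test significance the way the permutation test in multipatt does.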

Solution 1
0 accepted 2021-10-23 23:45:57
