
Error when using the "ward" method with the pvclust R package

I am having trouble with a cluster analysis that I am trying to run with the pvclust package.

Specifically, I have a data matrix with species as rows and sampling stations as columns. I want to run a cluster analysis (CA) to group my sampling stations by species abundance, which I have already log(x+1)-transformed.

Once my matrix was prepared, I tried to run the CA with the pvclust package, using Ward's clustering method and Bray-Curtis as the distance index. However, every time I get the following error message:

Error in hclust(distance, method = method.hclust) : invalid clustering method

I then tried the same analysis with another clustering method and had no problem. I also ran the same analysis with the hclust function together with the vegan package, and that worked without any problems too.

To illustrate the problem, here is part of my matrix and the script that I used to perform the analysis:

          P1        P2         P3         P4         P5       P6
1  10.8750000 3.2888889  2.0769231  1.4166667  3.2395833 5.333333
3   0.3645833 0.3027778  0.3212038  0.7671958  0.4993676 0.000000
4   0.0000000 0.0000000  2.3500000  0.0000000  0.0000000 0.264000
5   0.0000000 0.7333333  0.2692308  0.0000000  0.2343750 0.000000
6   0.0000000 0.9277778  0.0000000  0.2936508  0.7291667 0.000000
7   0.4166667 6.3500000  1.0925463  0.5476190  0.1885169 0.000000
8   1.6250000 0.0000000  0.0000000  0.0000000  5.2187500 0.000000
9   0.0000000 0.8111111  0.0000000  0.0000000  0.0000000 0.000000
10  2.6770833 0.6666667  2.3304890  4.5906085  2.9652778 0.000000
15  1.8020833 0.9666667  1.4807137  3.3878968  0.1666667 0.000000
16 17.8750000 4.9555556  1.4615385  6.5000000  7.8593750 7.666667
19  4.5312500 1.0555556  3.5766941  6.7248677  2.3196181 0.000000
20  0.0000000 0.6777778  0.5384615  0.0000000  0.0000000 0.000000
21  0.0000000 0.9777778  0.0000000  0.2500000  0.0000000 0.000000
24  1.2500000 3.0583333  0.1923077  0.0000000  4.9583333 0.000000
25  0.0000000 0.0000000  2.5699634  0.0000000  0.0000000 0.000000
26  6.6666667 2.2333333 24.8730020 55.9980159 17.6239583 0.000000

Here P1-P6 are my sampling stations, and the leftmost row numbers are my species. I'll call this example matrix "platforms".

Afterwards, I used the following lines of code:

dist <- function(x, ...){
  vegdist(x, ...)
}

result<-pvclust(platforms,method.dist = "bray",method.hclust = "ward")

Note that I run the first three lines because the Bray-Curtis index isn't available in the pvclust package by default. Running them let me specify the Bray-Curtis index in the pvclust function.

Does anyone know why it doesn't work with the pvclust package?

Any help will be much appreciated.

Kind regards,

Marie

There are two related issues:

  1. The method.hclust argument must be a method name that hclust accepts. In theory pvclust checks for ward and converts it to ward.D, but you probably want to pass one of the correct names, ward.D or ward.D2.
  2. You cannot overwrite dist in that fashion. However, you can pass a custom function to pvclust.

For instance, this should work:
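To make the first point concrete, here is a minimal sketch in base R only (no pvclust or vegan needed) showing that hclust accepts the two Ward variants by their full names and rejects a name it does not recognise. The toy matrix and the bogus method name "not-a-method" are made up for illustration. According to the hclust help, ward.D2 squares the dissimilarities before updating clusters and is the variant that implements Ward's criterion, so the two names are not interchangeable.

```r
# Toy data: 5 observations, 2 variables (illustrative values only)
x <- matrix(c(1, 2, 6, 7, 15,
              1, 3, 6, 8, 15), ncol = 2)
d <- dist(x)  # Euclidean distances between rows

# Both Ward variants are valid hclust method names
hc_d  <- hclust(d, method = "ward.D")
hc_d2 <- hclust(d, method = "ward.D2")
print(hc_d$method)
print(hc_d2$method)

# An unrecognised name triggers the same kind of error as in the question
msg <- tryCatch(
  hclust(d, method = "not-a-method"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Whichever variant you choose, pass that exact string as method.hclust.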

library(vegan)
library(pvclust)

sample.data <- "P1  P2  P3  P4  P5  P6
10.8750000  3.2888889   2.0769231   1.4166667   3.2395833   5.3333330
0.3645833   0.3027778   0.3212038   0.7671958   0.4993676   0.0000000
0.0000000   0.0000000   2.3500000   0.0000000   0.0000000   0.2640000
0.0000000   0.7333333   0.2692308   0.0000000   0.2343750   0.0000000
0.0000000   0.9277778   0.0000000   0.2936508   0.7291667   0.0000000
0.4166667   6.3500000   1.0925463   0.5476190   0.1885169   0.0000000
1.6250000   0.0000000   0.0000000   0.0000000   5.2187500   0.0000000
0.0000000   0.8111111   0.0000000   0.0000000   0.0000000   0.0000000
2.6770833   0.6666667   2.3304890   4.5906085   2.9652778   0.0000000
1.8020833   0.9666667   1.4807137   3.3878968   0.1666667   0.0000000
17.8750000  4.9555556   1.4615385   6.5000000   7.8593750   7.6666670
4.5312500   1.0555556   3.5766941   6.7248677   2.3196181   0.0000000
0.0000000   0.6777778   0.5384615   0.0000000   0.0000000   0.0000000
0.0000000   0.9777778   0.0000000   0.2500000   0.0000000   0.0000000
1.2500000   3.0583333   0.1923077   0.0000000   4.9583333   0.0000000
0.0000000   0.0000000   2.5699634   0.0000000   0.0000000   0.0000000
6.6666667   2.2333333   24.8730020  55.9980159  17.6239583  0.0000000"

platforms <- read.table(text = sample.data, header = TRUE)

result <- pvclust(platforms, 
                  method.dist = function(x){
                    vegdist(x, "bray")
                  },
                  method.hclust = "ward.D")
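One point worth verifying when you pass a custom distance function: vegdist computes dissimilarities between rows, whereas, as far as I can tell from the pvclust help (its custom cosine example works on columns), a function given as method.dist is expected to return distances between the columns being clustered. If that reading is right, the data needs transposing inside the function, for example vegdist(t(x), "bray"). The sketch below uses base R only to show what a column-wise Bray-Curtis function looks like and how to check its size. bray_curtis_cols and the toy matrix are hypothetical names and values for illustration, not part of pvclust or vegan.

```r
# Bray-Curtis dissimilarity between the COLUMNS of x, in base R only.
# bray_curtis_cols is a hypothetical helper, not a pvclust or vegan function.
bray_curtis_cols <- function(x) {
  x <- as.matrix(x)
  n <- ncol(x)
  d <- matrix(0, n, n, dimnames = list(colnames(x), colnames(x)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # sum of absolute differences divided by the summed abundances
      d[i, j] <- sum(abs(x[, i] - x[, j])) / sum(x[, i] + x[, j])
    }
  }
  res <- as.dist(d)
  attr(res, "method") <- "bray"
  res
}

# Toy abundance matrix: 3 species (rows) x 3 stations (columns)
toy <- cbind(P1 = c(1, 2, 0), P2 = c(3, 2, 0), P3 = c(0, 0, 5))
d_cols <- bray_curtis_cols(toy)
print(d_cols)

# The dist object should have one entry per station, not per species
print(attr(d_cols, "Size"))
```

A quick sanity check on real data is to compare attr(your_dist, "Size") with ncol(platforms): if it equals nrow(platforms) instead, the function is measuring distances between species rather than stations.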
