How to interpret cca vegan output

I have performed a canonical correspondence analysis in R using the vegan package, but I find the output very difficult to understand. The triplot is understandable, but all the numbers I get from summary(cca) are confusing to me (I've just started to learn about ordination techniques). I would like to know how much of the variance in Y is explained by X (in this case, the environmental variables), and which of the independent variables are important in this model.

My output looks like this:

Partitioning of mean squared contingency coefficient:
              Inertia Proportion
Total           4.151     1.0000
Constrained     1.705     0.4109
Unconstrained   2.445     0.5891

Eigenvalues, and their contribution to the mean squared contingency coefficient 

Importance of components:
                        CCA1   CCA2    CCA3    CCA4    CCA5    CCA6      CCA7
Eigenvalue            0.6587 0.4680 0.34881 0.17690 0.03021 0.02257 0.0002014
Proportion Explained  0.1587 0.1127 0.08404 0.04262 0.00728 0.00544 0.0000500
Cumulative Proportion 0.1587 0.2714 0.35548 0.39810 0.40538 0.41081 0.4108600

                         CA1    CA2     CA3     CA4     CA5     CA6     CA7
Eigenvalue            0.7434 0.6008 0.36668 0.33403 0.28447 0.09554 0.02041
Proportion Explained  0.1791 0.1447 0.08834 0.08047 0.06853 0.02302 0.00492
Cumulative Proportion 0.5900 0.7347 0.82306 0.90353 0.97206 0.99508 1.00000

Accumulated constrained eigenvalues

Importance of components:
                        CCA1   CCA2   CCA3   CCA4    CCA5    CCA6      CCA7
Eigenvalue            0.6587 0.4680 0.3488 0.1769 0.03021 0.02257 0.0002014
Proportion Explained  0.3863 0.2744 0.2045 0.1037 0.01772 0.01323 0.0001200
Cumulative Proportion 0.3863 0.6607 0.8652 0.9689 0.98665 0.99988 1.0000000

Scaling 2 for species and site scores
* Species are scaled proportional to eigenvalues
* Sites are unscaled: weighted dispersion equal on all dimensions

Species scores

                 CCA1     CCA2    CCA3      CCA4      CCA5       CCA6
S.marinoi     -0.3890  0.39759  0.1080 -0.005704 -0.005372 -0.0002441
C.tripos       1.8428  0.23999 -0.1661 -1.337082  0.636225 -0.5204045
P.alata        1.6892  0.17910 -0.3119  0.997590  0.142028  0.0601177
P.seriata      1.4365 -0.15112 -0.8646  0.915351 -1.455675 -1.4054078
D.confervacea  0.2098 -1.23522  0.5317 -0.089496 -0.034250  0.0278820
C.decipiens    2.2896  0.65801 -1.0315 -1.246933 -0.428691  0.3649382
P.farcimen    -1.2897 -1.19148 -2.3562  0.032558  0.104148 -0.0068910
C.furca        1.4439 -0.02836 -0.9459  0.301348 -0.975261  0.4861669

Biplot scores for constraining variables

                CCA1    CCA2     CCA3     CCA4     CCA5     CCA6
Temperature  0.88651  0.1043 -0.07283 -0.30912 -0.22541  0.24771
Salinity     0.32228 -0.3490  0.30471  0.05140 -0.32600  0.44408
O2          -0.81650  0.4665 -0.07151  0.03457  0.20399 -0.20298
Phosphate    0.22667 -0.8415  0.41741 -0.17725 -0.06941 -0.06605
TotP        -0.33506 -0.6371  0.38858 -0.05094 -0.24700 -0.25107
Nitrate      0.15520 -0.3674  0.38238 -0.07154 -0.41349 -0.56582
TotN        -0.23253 -0.3958  0.16550 -0.25979 -0.39029 -0.68259
Silica       0.04449 -0.8382  0.15934 -0.22951 -0.35540 -0.25650

Which of all these numbers are important to my analysis? /anna

How much variation is explained by X?

In a CCA, variance isn't variance in the usual sense. We express it as the "mean squared contingency coefficient", or "inertia". All the information you need to ascertain how much "variation" in Y is explained by X is contained in the section of the output that I reproduce below:

Partitioning of mean squared contingency coefficient:
              Inertia Proportion
Total           4.151     1.0000
Constrained     1.705     0.4109
Unconstrained   2.445     0.5891

In this example the total inertia is 4.151, and your X variables (these are the "constraints") explain 1.705 units of inertia, which is about 41%, leaving about 59% unexplained.
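The percentages quoted above are just ratios of the inertia values. A minimal sketch in base R, with the three numbers typed in by hand from the printed table (the variable names are mine, not part of vegan):

```r
# Inertia values copied from the "Partitioning" table above (rounded as printed)
total         <- 4.151
constrained   <- 1.705
unconstrained <- 2.445

# Share of total inertia explained by the constraints, and the remainder
prop_constrained   <- constrained / total
prop_unconstrained <- unconstrained / total

# Rounded to two decimals these are about 0.41 and 0.59
round(c(constrained = prop_constrained,
        unconstrained = prop_unconstrained), 2)
```

Because the printed inertias are themselves rounded, the ratio computed this way can differ from the Proportion column in the fourth decimal place.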

The next section, on eigenvalues, lets you see, both as inertia explained and as proportion explained, which axes contribute substantially to the explanatory "power" of the CCA (the Constrained part of the table above) and to the unexplained "variance" (the Unconstrained part of the table above). The first table expresses each axis as a proportion of the total inertia; the "Accumulated constrained eigenvalues" table expresses the CCA axes as a proportion of the constrained inertia only.
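To see how the two "Proportion Explained" rows for the CCA axes arise, here is a small base R sketch using the constrained eigenvalues typed in from the output above. The total inertia (4.151) is also copied from the output; the variable names are only illustrative.

```r
# Constrained eigenvalues (CCA1..CCA7) as printed in the output above
eig <- c(CCA1 = 0.6587, CCA2 = 0.4680, CCA3 = 0.34881, CCA4 = 0.17690,
         CCA5 = 0.03021, CCA6 = 0.02257, CCA7 = 0.0002014)
total_inertia <- 4.151

# Their sum is the constrained inertia (about 1.705)
constrained_inertia <- sum(eig)

# Proportion of TOTAL inertia per axis (first eigenvalue table)
prop_total <- eig / total_inertia

# Proportion of CONSTRAINED inertia per axis (second eigenvalue table)
prop_constrained <- eig / constrained_inertia
cum_constrained  <- cumsum(prop_constrained)

round(rbind(prop_total, prop_constrained, cum_constrained), 4)
```

Read this way, the first two axes carry roughly two thirds of what the environmental variables explain, but only about 27% of the total inertia.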

The next section contains the ordination scores. Think of these as the coordinates of the points in the triplot. For some reason you don't show the site scores in the output above, but they would normally be there. Note that these have been scaled; by default this uses scaling = 2, so site points are at the weighted average of the species scores, IIRC.

The "Biplot" scores are the locations of the arrow heads or of the labels on the arrows; I forget exactly how the plot is drawn now.
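If you want the coordinates as data rather than as printed text, the scores can be pulled out of the fitted object with vegan's scores() function. The sketch below is only an illustration: it uses the dune example data shipped with vegan and an arbitrary model formula (A1 + Management), not the data from the question.

```r
library(vegan)

# Example data shipped with vegan; stands in for the questioner's data
data(dune)
data(dune.env)

# Hypothetical model, chosen only so the example runs
mod <- cca(dune ~ A1 + Management, data = dune.env)

# Species ("sp"), site ("wa") and biplot ("bp") scores for axes 1-2, scaling 2
sc <- scores(mod, display = c("sp", "wa", "bp"), scaling = 2, choices = 1:2)

head(sc$species)  # species coordinates
head(sc$sites)    # site coordinates
sc$biplot         # arrow coordinates for the constraining variables
```

These are the same numbers that summary() prints and that plot() draws, so they are handy for building a custom triplot.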

Which of all these numbers are important to my analysis? 所有这些数字对我的分析都很重要?

All of them are important: if you think the triplot is important and interpretable, note that it is based entirely on the information reported by summary(). If you have specific questions to ask of the data, then perhaps only certain sections will be of paramount importance to you.
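For the specific question of which explanatory variables matter, summary() alone does not give a test. One common approach, sketched here as an assumption rather than as the only option, is a permutation test via vegan's anova() method for cca objects. The example again uses the dune data and an arbitrary formula, not the data from the question.

```r
library(vegan)

data(dune)
data(dune.env)

# Hypothetical model, chosen only so the example runs
mod <- cca(dune ~ A1 + Management, data = dune.env)

set.seed(1)  # permutation tests are random; fix the seed for repeatability

# Overall test of the constrained model
overall <- anova(mod, permutations = 999)

# Sequential test of each term, in the order given in the formula
by_term <- anova(mod, by = "terms", permutations = 999)

# Marginal test of each term, given all the other terms
by_margin <- anova(mod, by = "margin", permutations = 999)

print(by_term)
```

Sequential tests depend on the order of terms in the formula, while marginal tests do not, so the two tables can disagree when the explanatory variables are correlated.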

However, Stack Overflow is not the place to ask questions of a statistical nature.

I don't have the ability to comment, but in response to the first answer's interpretation of the species and site scores in scaling 2: I believe the explanation is backwards.

In the book "Numerical Ecology with R" by Borcard, Gillet, and Legendre, the authors clearly state that in scaling 2 the species scores are weighted averages of the sites.

This can be confirmed when using the ordihull function in CCA.
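A more direct numerical check can be sketched as follows. This is my own illustration on the dune example data with an arbitrary formula, and it assumes the relation is stated in terms of the linear-combination ("lc") site scores, for which, as I understand the method, it should hold up to numerical error. The helper name wa_of_sites is hypothetical.

```r
library(vegan)

data(dune)
data(dune.env)

# Hypothetical model, chosen only so the example runs
mod <- cca(dune ~ A1 + Management, data = dune.env)

# Species scores and linear-combination site scores, both in scaling 2
sp <- scores(mod, display = "species", scaling = 2, choices = 1:2)
lc <- scores(mod, display = "lc", scaling = 2, choices = 1:2)

# For each species, average the site scores weighted by its abundances
Y <- as.matrix(dune)
wa_of_sites <- (t(Y) %*% lc) / colSums(Y)

# Largest absolute difference; a value near zero supports the statement
max(abs(sp - wa_of_sites))
```

Repeating the comparison with scaling = 1 is a simple way to see that the relation is specific to the scaling.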

Also, the OP's output states that species scores are scaled and site scores are unscaled, which I believe confirms what the book says:

"Scaling 2 for species and site scores * Species are scaled proportional to eigenvalues * Sites are unscaled: weighted dispersion equal on all dimensions"

