What is the difference between R vegan species scores and Primer Spearman rank correlation of species to axis?

In constrained ordination analyses such as CAP or dbRDA, researchers often want to know how much of the dissimilarity is attributable to specific species. In Primer PERMANOVA, Spearman rank or Pearson correlations of species with the axes are an option that provides an estimate of which species characterise the variation between assemblages when using CAP or RDA. In R, vegan provides a different measurement, known as species scores, which can be calculated, but not without careful consideration of the potential shortcomings when using capscale (see https://github.com/vegandevs/vegan/issues/254#issuecomment-334071917 and the related question "vegan dbrda species scores are empty despite community matrix provided").

I would like to better understand how the correlations and species scores are calculated in Primer PERMANOVA. Firstly, is there a real difference in what these methods aim to show? What are the benefits and shortcomings of using Spearman or Pearson correlations over R vegan's species scores? Does the method of calculating the species-to-axis correlations in Primer suffer from problems similar to those detailed in the above links for species scores in capscale or dbrda in R? In Primer, which variables from the community matrix and the axes are used to calculate the correlations between them? Are these the raw or the transformed data? And finally, if the correlation method gives a better estimate than species scores in R of the relative amount by which species cause differences between assemblages, should it be considered as an alternative to R vegan's species scores?

I have never seen PRIMER and I cannot comment on differences between vegan and PRIMER, but I can explain how we work in vegan.

If you think about species scores or fitted environmental variables as arrows, there are two separate aspects: direction and length. First, about the direction of the arrow.

In general, the arrows are not parallel to the axes, but they point in the direction in which the species or the environmental variable changes most rapidly. The directions of the arrows can be found from the linear model lm(y ~ Ax1 + Ax2). If y is a species, this gives the arrow for the species score, and if y is an environmental variable, this gives a fitted vector. Correlating species with axes implies two separate models, lm(y ~ Ax1) and lm(y ~ Ax2). The vegan model defines one linear trend surface, whereas the axis model defines two separate linear trend surfaces, each having its steepest gradient along one axis. The following example shows how the linear model relates to species scores in PCA in vegan:

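library(vegan)
data(varespec)
pc <- rda(varespec) # rda without constraints is PCA
biplot(pc) # species scores as biplot arrows
# fitted vectors for four species, treated like external variables
plot(envfit(pc ~ Pleuschr + Cladarbu + Cladrang + Cladstel, varespec))
# linear (knots = 1) trend surface for one species
ordisurf(pc ~ Cladstel, varespec, knots = 1, add = TRUE)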

The envfit function adds arrows that point in the same direction as the species scores, and ordisurf adds a linear (knots = 1) trend surface for Cladstel. The isoclines of the linear trend surface are equally spaced and perpendicular to the arrow. Projecting sampling units onto the arrow gives the predicted species abundance in this two-dimensional solution. The interpretation of species scores is similar in RDA, but there you must remember to use linear combination scores (display="lc"), and in CCA you must remember to use weighted regression (envfit and ordisurf take care of that automatically, but with lm or other non-vegan tools you need explicit weights).
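As a rough numerical check of the statement that the species arrow follows the joint linear model, the sketch below regresses one species on the first two PCA axes and compares the direction of the slope vector with the direction of the species score. It assumes vegan's default scaling for rda; Cladstel is used only as an illustration, and the object names (site, spec, unit, dir_lm, dir_spec) are made up for this example.

```r
library(vegan)
data(varespec)

pc <- rda(varespec)

# Site and species scores on the first two axes (vegan's default scaling)
site <- scores(pc, display = "sites", choices = 1:2)
spec <- scores(pc, display = "species", choices = 1:2)

# Joint linear model: a single trend surface over both axes
fit <- lm(varespec$Cladstel ~ site[, 1] + site[, 2])

# Rescale a vector to unit length so that only its direction is compared
unit <- function(v) v / sqrt(sum(v^2))

dir_lm   <- unit(unname(coef(fit)[-1]))        # slope vector of the trend surface
dir_spec <- unit(unname(spec["Cladstel", ]))   # direction of the species arrow

print(rbind(lm = dir_lm, species = dir_spec))
```

The two rows should agree up to rounding error, because in PCA the species score is the slope of the species against the site scores apart from a constant multiplier.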

This approach is not easily changed to use rank correlations. For ranks, you need to project points (sampling units) onto a univariate sequence. Often people project onto axes (that is, they correlate axes with species). However, a better correlation will be found if we find a line through the origin that gives the best rank correlation when sampling units are projected onto it (if a unique line, or a sector containing such lines, exists). This would be similar to our approach of finding the direction of steepest change in a linear trend surface. That is easily done in Euclidean space, which all vegan ordination spaces are, but not with ranks of projections.
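The idea of looking for a line through the origin with the best rank correlation can be sketched with a brute-force search over directions. This is only an illustration under assumed choices (a one-degree grid of angles, the first two PCA axes, Cladstel as the species); it is not a vegan function and not necessarily how any package does it.

```r
library(vegan)
data(varespec)

pc <- rda(varespec)
site <- scores(pc, display = "sites", choices = 1:2)
y <- varespec$Cladstel

# Usual practice: rank-correlate the species with each axis separately
rho_axes <- cor(y, site, method = "spearman")[1, ]

# Alternative: try directions through the origin and keep the one whose
# projections give the strongest rank correlation
theta <- seq(0, pi, length.out = 181)   # half a circle in 1-degree steps
rho_dir <- sapply(theta, function(a) {
  proj <- site[, 1] * cos(a) + site[, 2] * sin(a)   # project sites onto the line
  cor(y, proj, method = "spearman")
})
best <- which.max(abs(rho_dir))

print(rho_axes)
cat("best angle (degrees):", theta[best] * 180 / pi,
    " Spearman rho:", rho_dir[best], "\n")
```

Because rank correlation changes in steps as the line rotates, several neighbouring angles can give the same value, which is the "sector containing lines" mentioned above.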

The assumption of a linear trend surface is pretty simplistic. It is the model for species in PCA and RDA, and it is the model for constraints in RDA, where it shows how the analysis sees your data (remember "lc" scores!). However, with other variables and with other ordination methods, more complicated response models are often more adequate. These can be fitted using ordisurf with knots > 1.
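A minimal sketch of that comparison, assuming the same PCA of varespec and an arbitrary choice of knots = 10 for the smooth surface (object names are illustrative):

```r
library(vegan)
data(varespec)

pc <- rda(varespec)

# Linear trend surface (the PCA/RDA species model) versus a smooth surface
surf_lin    <- ordisurf(pc ~ Cladstel, varespec, knots = 1, plot = FALSE)
surf_smooth <- ordisurf(pc ~ Cladstel, varespec, knots = 10, plot = FALSE)

# Proportion of deviance explained by each fitted surface
print(c(linear = summary(surf_lin)$dev.expl,
        smooth = summary(surf_smooth)$dev.expl))

# Draw the smooth surface over the site points
plot(pc, display = "sites", type = "points")
plot(surf_smooth, add = TRUE)
```

If the contours of the smooth surface are clearly curved or closed, a single arrow is a poor summary of that species.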

Then about the lengths of the arrows, or the distances of species scores from the origin. The envfit() function finds the correct direction, but it scales the arrow length by the correlation coefficient. In PCA and RDA, we have several alternative scaling options: see the long description of scaling and correlation in ?scores.cca. The default (correlation = FALSE) scaling makes arrows proportional to the absolute change in species abundance. That is, abundant species can change a lot and can have long arrows, but scarce species can change only a bit and always have short arrows. It is absolute change, not relative change. With correlation = TRUE, the arrow lengths are proportional to relative change and will be similar to the scaling by correlations used in envfit. Again, study the manual for details (?scores.cca).
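The effect of the correlation argument on arrow length can be inspected directly. The sketch below compares species-score lengths for the three most abundant and the three least abundant species in varespec; the helper arrow_len and the choice of species are assumptions made for illustration.

```r
library(vegan)
data(varespec)

pc <- rda(varespec)

# Default: arrow length reflects absolute change in abundance
sp_abs <- scores(pc, display = "species", choices = 1:2)
# correlation = TRUE: species scores are divided by each species' standard deviation
sp_rel <- scores(pc, display = "species", choices = 1:2, correlation = TRUE)

# Distance of each species score from the origin
arrow_len <- function(s) sqrt(rowSums(s^2))

# Three most abundant and three least abundant species
total <- colSums(varespec)
pick  <- names(sort(total, decreasing = TRUE))[c(1:3, ncol(varespec) - 2:0)]

print(round(cbind(total    = total[pick],
                  absolute = arrow_len(sp_abs)[pick],
                  relative = arrow_len(sp_rel)[pick]), 3))
```

Only the lengths differ between the two sets of scores; the direction of each species arrow stays the same.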

Please see below a very useful reply from Marti Anderson to an email about the question above, copied with her permission:

Dear Philip,

Thanks for your message and interest in all of these methods and their interpretation. I won't comment on the R-based stuff as implemented in vegan, as no doubt Jari is best able to do that, but I would be happy to comment on the raw rank (or Pearson) correlation vectors implemented in PRIMER/PERMANOVA software. First, let me emphasise that these are intended to be an exploratory tool. Quite simply, all they do is show the raw (or multiple) correlations of individual species (or other variables) with the ordination axes. Two important caveats to their use are important to note here. First, they most emphatically do not tell you anything directly about the specific role that individual species may have played in driving patterns seen in the dbRDA (or nMDS or PCO or whatever ordination axes you are talking about). They cannot do that because they are being drawn after the fact, and, as we know, the relationships between the original species variables and the ordination axes, in very many cases, are not expected to be linear (or even monotonic) at all. The ordination is being (generally) done in the space of some chosen dissimilarity measure (such as BC or Jaccard, etc.). This is appropriate for lots of reasons (which I am sure you already know and I won't go in to).

To see what the patterns are of individual species in a constrained ordination space like dbRDA (or any other ordination), I suggest you use bubble plots, which are a much more refined tool for visualising the patterns. (And incidentally, in PRIMER 7, you can superimpose segmented bubble plots to visualise patterns for a set of species simultaneously in this way.) The advantage of bubble plots is that they are able, for any given species, to show any pattern that might be occurring, including step-changes across groups, potential interactions across factors, unimodal or multi-modal patterns along gradients, etc. Now, the use of rank (or Pearson) correlation vectors has the advantage of being able to plot lots of species at once, but they are clearly a blunt tool by comparison, as they can only show well those species that may have increasing or decreasing relationships with the ordination axes, which is often not the whole story. They have their uses, however, particularly in the CAP setting, where the axes have been drawn specifically to maximise group differences. In this case, the groups are separated on the plot, so the species having longer axes drawn as increasing or decreasing across such plots will generally correspond to those species that characterise the group differences. However, even in the CAP example, sometimes the group differences are caused by changes in composition in whole sets of (minor) species acting together multivariately and simultaneously through the dissimilarity measure, and this might not be easily picked up by singular patterns for any of those species when considered independently and individually.

Well, I do hope that I have helped to clarify the issues here a little. The upshot of all of the above is that it is no problem to put these vectors onto any ordination plot you may choose, but they are a heuristic and posterior exercise, rather than a definitive or diagnostic tool in the dissimilarity-based kind of setting, and they only show a certain kind of relationship. Other tools (such as bubble or segmented bubble plots) will be richer in their information content regarding individual species patterns.

I hope the above is helpful! Kind regards, Marti Anderson
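For readers working in R, the kind of post hoc species-to-axis correlation described in the letter can be approximated as in the sketch below. This is not PRIMER's implementation: the constraints (N, P, K), the use of capscale with Bray-Curtis dissimilarities, the use of site scores, and the bubble sizing are all assumptions made for illustration. Note that Spearman correlations depend only on ranks, so any strictly increasing transformation of the abundances (square root, log(x + 1)) gives the same values as the raw data.

```r
library(vegan)
data(varespec)
data(varechem)

# Distance-based constrained ordination on Bray-Curtis dissimilarities
# (the constraints N, P and K are chosen only for illustration)
mod  <- capscale(varespec ~ N + P + K, data = varechem, distance = "bray")
site <- scores(mod, display = "sites", choices = 1:2)

# Post hoc Spearman rank correlation of every species with each axis
rho <- cor(varespec, site, method = "spearman")
strongest <- order(apply(abs(rho), 1, max), decreasing = TRUE)[1:5]
print(round(rho[strongest, ], 2))

# Bubble plot for one species: symbol size follows abundance, so
# step changes and unimodal patterns remain visible
plot(site, type = "n", asp = 1)
points(site, cex = 3 * sqrt(varespec$Cladstel / max(varespec$Cladstel)) + 0.2)
```

The correlation table ranks species by their monotonic association with the axes, while the bubble plot shows the full pattern for a single species, in line with the distinction drawn in the letter.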
