Vegan RDA and biplot, remove values contributing >10% of variance

I am using the vegan package to do RDA and want to plot the data using biplot. In my data I have hundreds of values. What I would like to do is limit the arrows shown by the variance they explain, using a set cut-off (0.1 in the example below). So instead of having 44 arrows I might only have, say, 8.

library(vegan)            # load vegan (rda, scores and the biplot method)
library(MASS)             # load MASS
data(varespec)            # example data shipped with vegan
vare.pca <- rda(varespec, scale = TRUE)               # RDA without constraints, i.e. a PCA
biplot(vare.pca, scaling = 3, display = "species")    # plots the data but includes all species

## Extract the percentage each species contributes to the first axis
x <- sort(round(100 * scores(vare.pca, display = "sp", scaling = 0)[, 1]^2, 3), decreasing = TRUE)
## Plot the percentages
plot(length(x):1, sort(x))   # percentage against rank (rank 1 = largest contribution)

Any help would be appreciated :)

Depending on the size of the data set, it would be possible to use either ordistep or ordiR2step to reduce the number of "unimportant" variables in your plot (see https://www.rdocumentation.org/packages/vegan/versions/2.4-2/topics/ordistep ). However, these functions use step-wise selection, which needs to be used cautiously. Step-wise selection can choose the included parameters based on AIC values, R2 values or p-values. It does not select variables based on their importance for the purpose of your question. It also does not mean that these variables have any meaning for organisms or biochemical interactions. Nevertheless, step-wise selection can help give an idea of which parameters might have a strong influence on the overall variation in the data set. Simple example below.

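rda0 <- rda(varespec ~ 1, varespec)   # null model: no constraints
rda1 <- rda(varespec ~ ., varespec)   # full model: every column of varespec as a candidate term

rdaplotp <- ordistep(rda0, scope = formula(rda1))   # step-wise selection starting from the null model
plot(rdaplotp, display = "species", type = "n")     # empty plot, axes set by the species scores
text(rdaplotp, display = "bp")                      # add only the terms that were retained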

Thus, by using the ordistep function, the number of species displayed in the plot has been greatly reduced (see Fig. 1 below). If you want to remove more variables (which I do not suggest), one option is to look at the biplot scores and throw out the variables that correlate least with the principal components (see below), but I would advise against it.

sumrda <- summary(rdaplotp)
sumrda$biplot
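
To make "least correlated" concrete, the biplot scores can also be pulled with scores() and ranked by arrow length on the plotted axes. This is only a sketch: the three species columns used as constraints and the names mod, bp and arrow_len are illustrative assumptions, not part of the model selected above.

```r
library(vegan)
data(varespec)

# Illustrative constrained model (assumption): three species columns
# of varespec are used as constraints, mirroring the approach above.
mod <- rda(varespec ~ Callvulg + Empenigr + Rhodtome, data = varespec)

# Biplot scores of the constraints on the first two axes
bp <- scores(mod, display = "bp", choices = 1:2)

# Arrow length in the plotted plane; short arrows are weakly related
# to the two axes being shown
arrow_len <- sqrt(rowSums(bp^2))
sort(arrow_len, decreasing = TRUE)
```

The terms at the bottom of the sorted vector would be the first candidates to drop, with the same caveats as above.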

It would be wise to first decide which question you want to answer and check whether any of the included variables can be left out beforehand. This alone would reduce the number. Minor edit: I am also a bit confused about why you want to remove parameters that contribute strongly to your captured variation.
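
If the aim is instead to keep only the species whose contribution to a plotted axis exceeds a cut-off, a minimal sketch along the lines of the question's own code might look like this. The 10% threshold, the use of the first two axes, and the names contrib, keep and threshold are assumptions made for illustration; the squared scores are rescaled explicitly so that each axis sums to 100%.

```r
library(vegan)
data(varespec)

vare.pca <- rda(varespec, scale = TRUE)

# Raw species scores on the first two axes
load <- scores(vare.pca, display = "species", choices = 1:2, scaling = 0)

# Percent contribution of each species to each axis
# (rescaled so every column sums to 100)
contrib <- 100 * sweep(load^2, 2, colSums(load^2), "/")

threshold <- 10   # assumed cut-off in percent
keep <- rownames(contrib)[apply(contrib, 1, max) > threshold]

# Empty ordination plot, then arrows only for the retained species
sp <- scores(vare.pca, display = "species", choices = 1:2, scaling = 3)
plot(vare.pca, type = "n", scaling = 3)
if (length(keep) > 0) {
  arrows(0, 0, sp[keep, 1], sp[keep, 2], length = 0.08, col = "red")
  text(sp[keep, , drop = FALSE], labels = keep, pos = 3, cex = 0.8)
}
```

Raising or lowering threshold changes how many arrows are drawn; the cut-off is a display choice and says nothing about ecological relevance.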

Fig. 1
