
Why does adonis() from vegan return a different p-value every time it is run?

The function adonis from the vegan package performs a non-parametric MANOVA, also known as PERMANOVA. The issue (or maybe it isn't an issue and I just don't fully understand how the test works) is that every time I run it on the same data, I get a slightly different p-value.

Here is some example data.

dframetest <- data.frame(X = rnorm(20), Y = rnorm(20), Z = rnorm(20), Label = c(rep("A",10),rep("B",10)))

adonis(dframetest[,1:3] ~ Label, permutations = 1000, data = dframetest, method = "euclidean")

If you run adonis a few times, you will see that the p-value is almost always slightly different, though it seems like there are only around 3-4 values it can take. It makes me wonder what would happen if you have data that is on the "verge" of being significant. How would you interpret the results if the returned values looked something like 0.053, 0.047, 0.05?
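For example, running the same call repeatedly and collecting the p-values makes the spread easy to see. This is only a minimal sketch: it assumes a vegan version that still provides adonis() (newer releases replace it with adonis2()) and that the p-value for the Label term sits in the Pr(>F) column of $aov.tab.

library(vegan)

# Repeat the adonis() call from above and keep the p-value each time.
pvals <- replicate(5, {
  fit <- adonis(dframetest[, 1:3] ~ Label, permutations = 1000,
                data = dframetest, method = "euclidean")
  fit$aov.tab[["Pr(>F)"]][1]  # permutation p-value for the Label term
})
pvals  # usually a handful of slightly different values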

As @user2554330 mentions, we use random permutations of the data to assess the test statistic. The permutations are pseudo-random, generated by functions from the permute package. If you want repeatable p-values, set the seed of the random number generator using set.seed(); e.g.

set.seed(42)
adonis(dframetest[,1:3] ~ Label, permutations = 1000, data = dframetest, method = "euclidean")
set.seed(42)
adonis(dframetest[,1:3] ~ Label, permutations = 1000, data = dframetest, method = "euclidean")

will yield the same set of permutations and hence the same p-value.

The accuracy of the permutation p-value will increase as you increase the number of permutations; if you try running your example without setting the RNG seed but with permutations = 10000, you should see less variation.
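Repeating the earlier sketch with permutations = 10000 illustrates this; again, it is only a sketch under the same assumptions about the adonis() output.

# Same idea as the replicate() sketch above, but with ten times more permutations.
pvals_10k <- replicate(5, {
  fit <- adonis(dframetest[, 1:3] ~ Label, permutations = 10000,
                data = dframetest, method = "euclidean")
  fit$aov.tab[["Pr(>F)"]][1]
})
range(pvals_10k)  # run-to-run spread should be noticeably narrower than with 1000 permutations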

adonis does a permutation test by selecting permutations at random. You asked for 1000 random permutations, so the p-value is based on the rank of your observed test statistic among those 1000 random ones. (You get easier numbers to interpret with the default permutations = 999; then p = 0.264 means your observed statistic is 264th from the top when included with the random ones.)
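The counting behind that rank can be written out directly. The numbers below are hypothetical, and this is not vegan's internal code, just the same counting logic:

# Hand-rolled permutation p-value: with 999 permutations,
# p = (number of permuted statistics >= observed, plus 1 for the observed itself) / (999 + 1)
obs_stat   <- 2.1                         # hypothetical observed F statistic
perm_stats <- rf(999, df1 = 1, df2 = 18)  # hypothetical stand-ins for the 999 permuted statistics
(sum(perm_stats >= obs_stat) + 1) / (999 + 1)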

If the returned values from 3 runs were 0.053, 0.047, 0.05, then you'd know that the true p-value (obtained by enumerating every possible permutation) was around 0.05. But even if you knew whether the true p-value was 0.049 or 0.051, the conclusion should be more or less the same: there's evidence of an effect that is big enough to show up by chance only about 1 in 20 times with purely random data unrelated to the predictor.

Sorry, I'm getting off topic here. If you want to ask about interpretation of p-values, you should probably be on Cross Validated, not Stack Overflow.
