
R: min, max, mean and median of a vector within 95% confidence interval (2.5 to 97.5 percentiles)

I drew 1000 normal deviates with rnorm and saved them in a vector:

rvec <- rnorm (1000, mean = 0.143927671, sd = 0.110680809)

I need to find the min, max, mean and median of the vector within a 95% confidence interval (2.5th to 97.5th percentiles). Are there any functions to do that in R? I was trying to use apply, but it doesn't seem to give what I want:

rmax = apply(rvec, 2, max, c(.025, 0.975))

So I want to estimate the min/max/mean/median of a population based on a random sample (a subset) of that population.

In Excel there is an add-in for Monte Carlo analysis, but I want to do that in R.

Thank you!

One way to get a confidence interval for the median based on a sample S would be to take bootstrap resamples of S, computing the median of each sample. Let's take your example (setting the random seed for reproducibility):

set.seed(100)
rvec <- rnorm (1000, mean = 0.143927671, sd = 0.110680809)
samp.medians <- replicate(500, median(sample(rvec, length(rvec), replace = TRUE)))
summary(samp.medians)
#    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
#  0.1327  0.1425  0.1480  0.1473  0.1505  0.1615 
quantile(samp.medians, c(0.025, 0.975))
#      2.5%     97.5% 
# 0.1377611 0.1574934 
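
The same resampling idea carries over to other statistics such as the mean. Here is a minimal sketch, assuming a helper called boot_ci (a hypothetical name, not from base R or any package) that returns the percentile bootstrap interval for whatever statistic function you pass in:

```r
# boot_ci is a hypothetical helper name, not part of base R or any package.
# It resamples x with replacement n_boot times, applies stat to each
# resample, and returns the percentile interval of the resulting values.
boot_ci <- function(x, stat, n_boot = 500, level = 0.95) {
  alpha <- 1 - level
  boot_stats <- replicate(n_boot, stat(sample(x, length(x), replace = TRUE)))
  quantile(boot_stats, c(alpha / 2, 1 - alpha / 2))
}

set.seed(100)
rvec <- rnorm(1000, mean = 0.143927671, sd = 0.110680809)
ci_mean   <- boot_ci(rvec, mean)    # interval for the mean
ci_median <- boot_ci(rvec, median)  # interval for the median
```

Because the two calls share one random number stream, ci_median will not reproduce the numbers printed above exactly, but it should be close.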

A separate concept is the confidence interval around the quantile that the current estimate represents. For example, if you take the median of 10 samples, that is an estimate of the 50th percentile of the distribution, but because it is only an estimate there is some error. To get the range of quantiles that your estimate represents, you can use binom.test, as in

binom.test(x=sum(rvec>median(rvec)),n=length(rvec),conf.level=0.95)
#> [some text omitted from the output of binom.test]
#> 95 percent confidence interval:
#>  0.4685492 0.5314508

which indicates that, with 95% confidence, median(rvec) lies between the 46.9th and 53.1st percentiles of the underlying distribution.

Note that bootstrapping gives you a range within which the true median of the underlying distribution is likely to fall, but it isn't valid for statistics like min and max, whose empirical estimates (max(rvec), for example) are biased. However, the binom.test method above gives confidence intervals for the percentiles of the distribution that your favorite statistic (min, max, median, mean, 75th percentile, etc.) is likely to fall within.
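
To make that last point concrete, here is a rough sketch that applies the binom.test idea to any sample statistic by counting how many observations fall at or below it. The helper name rank_ci is hypothetical, and counting with <= rather than > is an assumption made here so that the interval reads as a percentile measured from the bottom of the distribution:

```r
# rank_ci is a hypothetical helper, not part of base R.
# It returns the confidence interval (as a proportion) for the quantile of
# the underlying distribution that the value stat_value corresponds to.
rank_ci <- function(x, stat_value, level = 0.95) {
  ci <- binom.test(x = sum(x <= stat_value), n = length(x),
                   conf.level = level)$conf.int
  as.numeric(ci)  # drop the conf.level attribute
}

set.seed(100)
rvec <- rnorm(1000, mean = 0.143927671, sd = 0.110680809)
ci_q75 <- rank_ci(rvec, quantile(rvec, 0.75))  # sample 75th percentile
ci_max <- rank_ci(rvec, max(rvec))  # upper limit is 1: every point is <= max
```

Multiplying either result by 100 gives the interval in percentile terms.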


 