
Calculating mean and variance using Monte Carlo methods, given a density kernel

Assuming the density kernel is equal to: https://i.stack.imgur.com/irjRs.png , what Monte Carlo methods can I use to estimate the mean and variance of the distribution?

We can use numerical methods here. First of all, we create a function to represent your probability density function (though this is not yet scaled so that its integral is 1):

pdf <- function(x) x^2 * exp(-x^2/4)

plot(pdf, xlim = c(0, 10))

We can see that almost all of the area under the curve lies where x < 10, so if we integrate from 0 to, say, x = 100, we should get a very accurate scaling factor for turning the kernel into a true pdf:

integrate(pdf, 0, 100)$value
#> [1] 3.544908
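As a sanity check on this scaling factor, the integral can also be worked out by hand. The standard Gaussian integral of x^2 * exp(-a * x^2) over x > 0 is sqrt(pi) / (4 * a^(3/2)), which for a = 1/4 gives 2 * sqrt(pi). The sketch below compares the two, assuming the kernel's support is x > 0 as in the plots; the names kernel, numeric_const and exact_const are only illustrative.

```r
# Unnormalised density kernel from the question (assumed support: x > 0)
kernel <- function(x) x^2 * exp(-x^2 / 4)

# Numerical integral over the whole support
numeric_const <- integrate(kernel, 0, Inf)$value

# Closed form of the same integral: 2 * sqrt(pi)
exact_const <- 2 * sqrt(pi)

# Absolute difference between the two; should be very small
abs(numeric_const - exact_const)
```

If the two agree, 2 * sqrt(pi) can be used in place of the hard-coded 3.544908.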

So now we can generate a genuine pdf:

pdf <- function(x)  x^2 * exp(-x^2/4) / 3.544908

plot(pdf, xlim = c(0, 10))

Now that we have a pdf, we can create a cdf with numerical integration:

cdf <- function(x) sapply(x, \(i) integrate(pdf, 0, i)$value)

plot(cdf, xlim = c(0, 10))

The inverse of the cdf is what we need to convert a sample taken from a uniform distribution between 0 and 1 into a sample drawn from our new distribution. We can create an inverse function by using uniroot to find where the output of our cdf matches an arbitrary number between 0 and 1:

inverse_cdf <- function(p) 
{
  sapply(p, function(i) {
              uniroot(function(a) {cdf(a) - i}, c(0, 100))$root
  })
}

The inverse cdf looks like this:

plot(inverse_cdf, xlim = c(0, 0.99))
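One way to convince yourself that the inverse is behaving is a round trip: push a few probabilities through inverse_cdf and then back through cdf, and check that you recover the values you started with. This is only a sketch; the definitions are repeated so that it runs on its own, the normalising constant is written as 2 * sqrt(pi) (the closed form of the 3.544908 above), and the probabilities in p are arbitrary.

```r
# Same definitions as in the answer, repeated so this block runs on its own
pdf <- function(x) x^2 * exp(-x^2 / 4) / (2 * sqrt(pi))
cdf <- function(x) sapply(x, function(i) integrate(pdf, 0, i)$value)
inverse_cdf <- function(p) {
  sapply(p, function(i) uniroot(function(a) cdf(a) - i, c(0, 100))$root)
}

# Arbitrary probabilities chosen for the check
p <- c(0.1, 0.5, 0.9)

# cdf(inverse_cdf(p)) should reproduce p up to uniroot's tolerance
round_trip <- cdf(inverse_cdf(p))
max(abs(round_trip - p))
```

The remaining error comes from uniroot's default tolerance; passing a smaller tol to uniroot would tighten it at the cost of more cdf evaluations.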

We are now ready to draw a sample from our distribution:

set.seed(1) # Makes this draw reproducible

x_sample <- inverse_cdf(runif(1000))

Now we can plot a histogram of our sample and ensure it matches the pdf:

hist(x_sample, freq = FALSE)

plot(function(x) pdf(x), add = TRUE, xlim = c(0, 6))

Now that we are confident we have a sample drawn from our distribution, we can use the sample mean and standard deviation as estimates for the distribution's mean and standard deviation:

mean(x_sample)
#> [1] 2.264438

sd(x_sample)
#> [1] 0.9625839
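The question asks for the variance rather than the standard deviation; for the sample that is just var(x_sample), or the square of the sd. To see how close the Monte Carlo estimates are, the moments can also be computed by numerical integration. This is a deterministic cross-check rather than a Monte Carlo method, and the names m1, m2, true_mean, true_var and true_sd are only illustrative. For this kernel the closed forms work out to 4 / sqrt(pi) for the mean and 6 - 16 / pi for the variance.

```r
# Normalised pdf, using 2 * sqrt(pi) as the exact scaling factor
pdf <- function(x) x^2 * exp(-x^2 / 4) / (2 * sqrt(pi))

# First and second moments by numerical integration over x > 0
m1 <- integrate(function(x) x * pdf(x), 0, Inf)$value
m2 <- integrate(function(x) x^2 * pdf(x), 0, Inf)$value

true_mean <- m1
true_var  <- m2 - m1^2      # Var(X) = E[X^2] - E[X]^2
true_sd   <- sqrt(true_var)

c(mean = true_mean, var = true_var, sd = true_sd)
```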

We can increase the accuracy of these estimates by drawing a larger sample, that is, by increasing the 1000 in our call to inverse_cdf(runif(1000)) to a larger number.
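Because every draw from inverse_cdf runs a root search over a numerically integrated cdf, very large samples get slow. For this particular kernel there is a shortcut, which is a different technique from the inverse-cdf approach above: if Q follows a chi-squared distribution with 3 degrees of freedom, then sqrt(2 * Q) has density proportional to x^2 * exp(-x^2 / 4). The sketch below uses that to draw a large sample directly and to report the Monte Carlo standard error of the mean, which shrinks like 1 / sqrt(n). The names n, x_big, mc_mean, mc_var and mc_se are only illustrative.

```r
set.seed(1)  # Makes this draw reproducible

n <- 1e5

# Assumption: X = sqrt(2 * Q), Q ~ chi-squared(3), has density
# proportional to x^2 * exp(-x^2 / 4) on x > 0
x_big <- sqrt(2 * rchisq(n, df = 3))

mc_mean <- mean(x_big)          # Monte Carlo estimate of the mean
mc_var  <- var(x_big)           # Monte Carlo estimate of the variance
mc_se   <- sd(x_big) / sqrt(n)  # standard error of the mean estimate

c(mean = mc_mean, var = mc_var, se = mc_se)
```

This shortcut only applies because the kernel happens to match a known distribution; the inverse-cdf approach works for any kernel that can be integrated numerically.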

Created on 2021-11-06 by the reprex package (v2.0.0)
