
Parameter Estimation in R Simulation

I am fairly new to R and am exploring simulation to estimate the parameter n (an integer), given that:

1) Z is a vector of n draws from N(0,1)

2) The probability that max(Z) > 4 equals 0.25

What is the best way in R to estimate the parameter n so that these two conditions are satisfied? I got stuck when trying to avoid looping or exhaustive search in the code. Thanks!

Edit: assuming a purely simulation-based result, with no attempt to do this analytically, I'd create a function like this:

prob <- function(n) {
  sum(replicate(10000, max(rnorm(n))) > 4)/10000
}

To explain that a bit, max(rnorm(n)) > 4 is TRUE or FALSE for a single simulated sample. The call to replicate repeats the draw 10000 times, and the sum divided by 10000 is the proportion of samples whose maximum exceeds 4, which serves as the probability estimate.

Then I would check out the ?optimise help page to try to get an estimate for n. You'd need to create another function that has a minimum when prob(n) = 0.25, so something like:

result <- function(n) abs(prob(n) - 0.25)

Note that, depending on how you pick your parameters, this could take a long time to run. Test things out first to see what values of n might be reasonable.
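As a rough sketch of that "test things out first" step, the simulation can be evaluated on a coarse grid before handing anything to optimise. The function name prob_mc, its reps argument, the seed, and the grid values are all assumptions made for illustration; they are not part of the original answer.

```r
# Monte Carlo estimate of P(max(Z) > 4) for n standard normal draws.
# `reps` is a hypothetical extra argument so the cost can be tuned;
# the original prob() hard-codes 10000 replications.
prob_mc <- function(n, reps = 500) {
  mean(replicate(reps, max(rnorm(n))) > 4)
}

set.seed(1)  # arbitrary seed, only so the run is reproducible

# Assumed candidate values for n, chosen to span a wide range.
n_grid <- c(1000, 5000, 10000, 20000)
est <- vapply(n_grid, prob_mc, numeric(1))

data.frame(n = n_grid, estimate = est)
```

The grid values between which the estimate crosses 0.25 suggest an interval to pass to optimise. Because each call to the simulation returns a slightly different value, result(n) is a noisy objective, so the minimum that optimise reports should be treated as approximate.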

Here is another (related) way, which takes advantage of pnorm, the CDF of N(0,1). pnorm(4) is the probability that a single draw from N(0,1) is <= 4, and consequently 1 - pnorm(4) is the probability that a single draw is greater than 4. If any draw is greater than 4, then obviously the max is greater than 4, so we just need to concentrate on the probability that at least one observation is greater than 4.
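The single-draw probabilities above can be checked directly. This is only a small illustration; the variable names are made up here, and lower.tail = FALSE is a standard pnorm argument that returns the upper tail without the subtraction.

```r
# Probability that one N(0,1) draw is <= 4
p_le4 <- pnorm(4)

# Probability that one N(0,1) draw is > 4, computed as the upper tail
p_gt4 <- pnorm(4, lower.tail = FALSE)

# The two forms agree up to floating-point error
all.equal(1 - p_le4, p_gt4)
```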

Since the draws are independent we can take products: the probability that all n draws are <= 4 is pnorm(4)^n, so the probability that at least one of the n draws is greater than 4 is 1 - pnorm(4)^n. Based on this we can create an objective function and solve:

# Minimize squared deviations
fopt <- function(n){(1 - pnorm(4)^n - .25)^2} 
# or .75 - pnorm(4)^n, but this is clearer

# I specify start and end points. We guess really wide
optimise(fopt, interval = c(100, 100000))
#> $minimum
#> [1] 9083.241
#> 
#> $objective
#> [1] 2.262374e-20

# Now check the result
(1 - pnorm(4)^9083.241)
#> [1] 0.25

We see we get a result of 9083.241, which evaluates to .25 at the printed precision. If we take only the integer part (9083), it evaluates to .2499943.
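Since the expression 1 - pnorm(4)^n can be inverted by hand, the same value can also be obtained without a numerical optimiser by solving pnorm(4)^n = 0.75 for n. The sketch below is one way to write that, not part of the original answer; the names target, n_exact, candidates, p_hit and best_n are made up for illustration.

```r
target <- 0.25

# Solve 1 - pnorm(4)^n = target  =>  n = log(1 - target) / log(pnorm(4)).
# log.p = TRUE makes pnorm return log(pnorm(4)) directly.
n_exact <- log(1 - target) / pnorm(4, log.p = TRUE)
n_exact

# n has to be an integer, so compare the two neighbouring integers
# and keep the one whose probability is closest to the target.
candidates <- c(floor(n_exact), ceiling(n_exact))
p_hit <- 1 - pnorm(4)^candidates
best_n <- candidates[which.min(abs(p_hit - target))]
best_n
```

No integer n gives exactly 0.25, so the comparison at the end simply reports which neighbour comes closest.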
