Trying to simulate Poisson samples using the inverse CDF method, but my R function produces wrong results
I wrote some R code to simulate random samples from a Poisson distribution, based on a description of an algorithm (see attached image). But my code does not seem to work correctly: the generated samples follow a different pattern from those produced by R's built-in rpois() function. Can anybody tell me what I did wrong and how to fix my function?
r.poisson <- function(n, l=0.5)
{
  U <- runif(n)
  X <- rep(0,n)
  p=exp(-l)
  F=p
  for(i in 1:n)
  {
    if(U[i] < F)
    {
      X[i] <- i
    } else
    {
      p=p*l/(i+1)
      F=F+p
      i=i+1
    }
  }
  return(X)
}
r.poisson(50)
The output is very different from that of rpois(50, lambda = 0.5). The algorithm I followed is the one in the attached image.
(Thank you for your question. Now I know how a Poisson random variable is simulated.)
You had a misunderstanding. The inverse CDF method (with recursive computation) you referenced generates a single Poisson random sample. In your code, the loop variable i serves as both the sample index and the Poisson count, and p and F are never reset between samples. So you need to fix this function to produce a single number. Here is the correct function, commented to help you follow each step.
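rpois1 <- function (lambda) {
  ## step 1: draw one uniform random number
  U <- runif(1)
  ## step 2: start at i = 0 with p = P(X = 0) and F = P(X <= 0)
  i <- 0
  p <- exp(-lambda)
  F <- p
  ## you need an "infinite" loop
  ## no worry, it will "break" at some time
  repeat {
    ## step 3: stop as soon as U falls below the cumulative probability
    if (U < F) {
      X <- i
      break
    }
    ## step 4: move to the next value; P(X = i) = lambda * P(X = i - 1) / i
    i <- i + 1
    p <- lambda * p / i
    F <- F + p
    ## back to step 3
  }
  return(X)
}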
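For reference, the update in step 4 can be read off the Poisson probability mass function. A short derivation, writing p_i for P(X = i) and F_i for P(X <= i) (this notation is introduced here for illustration and is not part of the original post):

```latex
p_i = \frac{e^{-\lambda}\lambda^{i}}{i!}, \qquad
p_0 = e^{-\lambda}, \qquad
\frac{p_{i+1}}{p_i} = \frac{\lambda}{i+1}
\;\Longrightarrow\;
p_{i+1} = \frac{\lambda\, p_i}{i+1}, \qquad
F_{i+1} = F_i + p_{i+1}.
```

When i is incremented before p is updated, as in step 4, the divisor i + 1 in the recurrence is simply the new value of i.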
Now to get n samples, you need to call this function n times. R has a nice function called replicate that evaluates an expression repeatedly.
r.poisson <- function (n, lambda) {
  ## use `replicate()` to call `rpois1` n times
  replicate(n, rpois1(lambda))
}
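To illustrate why replicate fits here, the following minimal sketch contrasts it with rep, using runif(1) as a stand-in for any function that returns one random draw. The variable names and the seed are arbitrary choices made for this example.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# replicate() re-evaluates the expression on every iteration,
# so each element is a fresh random draw.
fresh <- replicate(5, runif(1))

# rep() evaluates its argument once and then copies the result.
copied <- rep(runif(1), 5)

length(unique(fresh))   # number of distinct values among the fresh draws
length(unique(copied))  # number of distinct values among the copies

# vapply() is a stricter alternative: it also checks that every call
# returns exactly one numeric value.
strict <- vapply(seq_len(5), function(k) runif(1), numeric(1))
```

The vapply form is a matter of taste; it mainly helps when the repeated function might accidentally return something other than a single number.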
Now we can make a reasonable comparison with R's own rpois.
x1 <- r.poisson(1000, lambda = 0.5)
x2 <- rpois(1000, lambda = 0.5)
## set breaks reasonably when making a histogram
xmax <- max(x1, x2) + 0.5
par(mfrow = c(1, 2))
hist(x1, main = "proof-of-concept-implementation", breaks = seq.int(-0.5, xmax))
hist(x2, main = "R's rpois()", breaks = seq.int(-0.5, xmax))
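Histograms give a visual impression; a numeric check against the exact probabilities from dpois may also help. The sketch below is self-contained and, as an assumption made for brevity, draws its samples with qpois(runif(n), lambda), which is the inverse CDF method expressed through R's built-in quantile function. The same comparison should apply to the output of r.poisson() above.

```r
set.seed(42)  # arbitrary seed, for reproducibility only
lambda <- 0.5
n <- 1e5

# Stand-in sampler (assumption): inverse CDF via the built-in quantile function.
x <- qpois(runif(n), lambda)

# Empirical relative frequencies on 0..kmax versus the exact pmf.
kmax <- max(x)
emp  <- as.vector(table(factor(x, levels = 0:kmax))) / n
theo <- dpois(0:kmax, lambda)

# Side-by-side table and the largest absolute discrepancy.
print(round(cbind(k = 0:kmax, empirical = emp, theoretical = theo), 4))
max(abs(emp - theo))  # should be small for large n
```

A sampler with a wrong recurrence would typically show up here as a large gap in the first few rows.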
A vectorized version will run much faster than a non-vectorized function using replicate. The idea is to iteratively drop the uniform random samples that have already been assigned a value as i is incremented.
r.poisson1 <- function(n, l = 0.5) {
  U <- runif(n)
  i <- 0L
  X <- integer(n)
  p <- exp(-l)
  F <- p
  idx <- 1:n
  while (length(idx)) {
    bln <- U < F
    X[idx[bln]] <- i
    p <- l*p/(i <- i + 1L)
    F <- F + p
    idx <- idx[!bln]
    U <- U[!bln]
  }
  X
}
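One way to convince yourself that the drop-as-you-go loop implements the inverse CDF is to feed it known uniforms and compare with qpois on the same values. The sketch below uses a hypothetical variant, pois_from_u, that takes the uniforms as an argument instead of drawing them internally; the name, the signature, and the seed are assumptions made for this illustration.

```r
# Hypothetical variant of the vectorized sampler that accepts the uniforms
# as an argument, so its output can be checked deterministically.
pois_from_u <- function(U, l = 0.5) {
  n <- length(U)
  X <- integer(n)
  i <- 0L
  p <- exp(-l)               # P(X = 0)
  F <- p                     # P(X <= 0)
  idx <- seq_len(n)          # positions of the still-unresolved samples
  while (length(idx)) {
    done <- U < F            # uniforms that fall below the current CDF value
    X[idx[done]] <- i        # those samples take the current count i
    i <- i + 1L
    p <- l * p / i           # next Poisson probability via the recurrence
    F <- F + p
    idx <- idx[!done]        # keep only the unresolved samples
    U <- U[!done]
  }
  X
}

set.seed(123)  # arbitrary seed
U <- runif(1e4)
x_vec <- pois_from_u(U, l = 0.5)
x_ref <- as.integer(qpois(U, lambda = 0.5))
all(x_vec == x_ref)  # compares the loop against the built-in quantile function
```

The two results should agree except, in principle, when a uniform lands within rounding error of a jump in the CDF, which is extremely unlikely.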
@Zheyuan Li's non-vectorized functions:
rpois1 <- function (lambda) {
  ## step 1
  U <- runif(1)
  ## step 2
  i <- 0
  p <- exp(-lambda)
  F <- p
  ## you need an "infinite" loop
  ## no worry, it will "break" at some time
  repeat {
    ## step 3
    if (U < F) {
      X <- i
      break
    }
    ## step 4: P(X = i) = lambda * P(X = i - 1) / i
    i <- i + 1
    p <- lambda * p / i
    F <- F + p
    ## back to step 3
  }
  return(X)
}
r.poisson2 <- function (n, lambda) {
  ## use `replicate()` to call `rpois1` n times
  replicate(n, rpois1(lambda))
}
Benchmark:
microbenchmark::microbenchmark(r.poisson1(1e5),
                               r.poisson2(1e5, 0.5),
                               rpois(1e5, 0.5))
#> Unit: milliseconds
#>                    expr        min         lq       mean     median         uq        max neval
#>       r.poisson1(1e+05)   3.063202   3.129151   3.782200   3.225402   3.734600  18.377700   100
#>  r.poisson2(1e+05, 0.5) 217.631002 244.816601 269.692648 267.977001 287.599251 375.910601   100
#>       rpois(1e+05, 0.5)   1.519901   1.552300   1.649026   1.579551   1.620451   7.531401   100