Trying to simulate Poisson samples using the inverse CDF method, but my R function produces wrong results

Question

I wrote some R code to simulate random samples from a Poisson distribution, based on a description of an algorithm (see attached image). But my code does not seem to work correctly: the generated samples follow a different pattern from those produced by R's built-in rpois() function. Can anybody tell me what I did wrong and how to fix my function?

r.poisson <- function(n, l=0.5)
{
  U <- runif(n)
  X <- rep(0,n)
  p=exp(-l)
  F=p
  for(i in 1:n)
  {
    if(U[i] < F)
    {
      X[i] <- i
    } else
    {
      p=p*l/(i+1)
      F=F+p
      i=i+1
    }
  }
  return(X)
}

r.poisson(50)

The output is very different from rpois(50, lambda = 0.5). The algorithm I followed is the one in the attached image.

Answer 1

(Thank you for your question. Now I know how a Poisson random variable is simulated.)

You misunderstood the algorithm. The inverse CDF method (with recursive computation) that you referenced generates a single Poisson random sample, so the function needs to be rewritten to produce a single number. Here is the correct function, commented to help you follow each step.

rpois1 <- function (lambda) {
  ## step 1
  U <- runif(1)
  ## step 2
  i <- 0
  p <- exp(-lambda)
  F <- p
  ## you need an "infinite" loop
  ## no worry, it will "break" at some time
  repeat {
    ## step 3
    if (U < F) {
      X <- i
      break
    }
    ## step 4
    i <- i + 1
    ## recursion: P(X = i) = P(X = i - 1) * lambda / i
    p <- lambda * p / i
    F <- F + p
    ## back to step 3
  }
  return(X)
}
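
The step 4 update follows from the ratio of consecutive Poisson probabilities. As a short derivation, writing $p_i = P(X = i)$ and $F_i = P(X \le i)$ (notation introduced here for illustration, not taken from the question's image):

```latex
p_i = e^{-\lambda}\,\frac{\lambda^{i}}{i!},
\qquad
\frac{p_{i+1}}{p_i} = \frac{\lambda}{i+1}
\;\Longrightarrow\;
p_{i+1} = \frac{\lambda\, p_i}{i+1},
\qquad
F_{i+1} = F_i + p_{i+1},
\qquad
p_0 = F_0 = e^{-\lambda}.
```

Because i is incremented before p is updated in the loop, the update should divide by the new value of i.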

Now, to get n samples, you need to call this function n times. R has a handy function called replicate that evaluates an expression repeatedly.

r.poisson <- function (n, lambda) {
  ## use `replicate()` to call `rpois1` n times
  replicate(n, rpois1(lambda))
}

Now we can make a reasonable comparison with R's own rpois.

x1 <- r.poisson(1000, lambda = 0.5)
x2 <- rpois(1000, lambda = 0.5)

## set breaks reasonably when making a histogram
xmax <- max(x1, x2) + 0.5
par(mfrow = c(1, 2))
hist(x1, main = "proof-of-concept-implementation", breaks = seq.int(-0.5, xmax))
hist(x2, main = "R's rpois()", breaks = seq.int(-0.5, xmax))
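
Histograms give a visual check; a numeric one is to tabulate the empirical frequencies next to the theoretical probabilities from dpois(). The sketch below is one possible way to do that. The helper name compare_to_pmf is hypothetical, and base R's rpois() stands in for the sampler so the block runs on its own; any vector of draws can be passed instead.

```r
## Hypothetical helper: empirical frequencies vs. dpois() for a vector of draws.
compare_to_pmf <- function(x, lambda) {
  k <- 0:max(x)
  ## factor() with explicit levels keeps counts of zero for unseen values
  empirical <- as.vector(table(factor(x, levels = k))) / length(x)
  data.frame(k = k, empirical = empirical, theoretical = dpois(k, lambda))
}

set.seed(1)
## Assumption: rpois() is used here only as a stand-in sampler
x <- rpois(1000, lambda = 0.5)
res <- compare_to_pmf(x, lambda = 0.5)
print(res)
```

With 1000 draws the two columns should agree to within sampling noise; a large gap for some k would point to a bug in the sampler under test.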

Answer 2

A vectorized version will run much faster than a non-vectorized function using replicate. The idea is to process all uniform random samples at once: at each value of i, the samples with U < F are assigned i and dropped, and the loop continues with the rest.

r.poisson1 <- function(n, l = 0.5) {
  U <- runif(n)
  i <- 0L
  X <- integer(n)
  p <- exp(-l)
  F <- p
  idx <- 1:n
  while (length(idx)) {
    bln <- U < F
    X[idx[bln]] <- i
    p <- l*p/(i <- i + 1L)
    F <- F + p
    idx <- idx[!bln]
    U <- U[!bln]
  }
  X
}
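
One way to sanity-check the vectorized loop is to feed the same uniforms to it and to base R's quantile function qpois(), which also inverts the Poisson CDF. The sketch below assumes a hypothetical variant, inv_cdf_pois, that takes the uniforms as an argument instead of drawing them internally; apart from that it follows the same steps.

```r
## Hypothetical helper: the same loop as the vectorized sampler, but the
## uniforms are passed in so the result can be compared with qpois().
inv_cdf_pois <- function(U, l = 0.5) {
  n <- length(U)
  X <- integer(n)
  i <- 0L
  p <- exp(-l)          # P(X = 0)
  F <- p                # P(X <= 0)
  idx <- seq_len(n)     # positions still waiting for a value
  while (length(idx)) {
    bln <- U < F
    X[idx[bln]] <- i    # these draws fall in the current CDF step
    i <- i + 1L
    p <- l * p / i      # P(X = i) from P(X = i - 1)
    F <- F + p
    idx <- idx[!bln]
    U <- U[!bln]
  }
  X
}

set.seed(42)
U <- runif(1000)
agree <- inv_cdf_pois(U, 0.5) == qpois(U, 0.5)
mean(agree)
```

The two can differ only when a uniform lands essentially on a CDF jump (floating-point ties), so the proportion of agreement should be 1 or extremely close to it.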

@Zheyuan Li's non-vectorized functions:

rpois1 <- function (lambda) {
  ## step 1
  U <- runif(1)
  ## step 2
  i <- 0
  p <- exp(-lambda)
  F <- p
  ## you need an "infinite" loop
  ## no worry, it will "break" at some time
  repeat {
    ## step 3
    if (U < F) {
      X <- i
      break
    }
    ## step 4
    i <- i + 1
    ## recursion: P(X = i) = P(X = i - 1) * lambda / i
    p <- lambda * p / i
    F <- F + p
    ## back to step 3
  }
  return(X)
}

r.poisson2 <- function (n, lambda) {
  ## use `replicate()` to call `rpois1` n times
  replicate(n, rpois1(lambda))
}

Benchmark:

microbenchmark::microbenchmark(r.poisson1(1e5),
                               r.poisson2(1e5, 0.5),
                               rpois(1e5, 0.5))
#> Unit: milliseconds
#>                    expr        min         lq       mean     median         uq        max neval
#>       r.poisson1(1e+05)   3.063202   3.129151   3.782200   3.225402   3.734600  18.377700   100
#>  r.poisson2(1e+05, 0.5) 217.631002 244.816601 269.692648 267.977001 287.599251 375.910601   100
#>       rpois(1e+05, 0.5)   1.519901   1.552300   1.649026   1.579551   1.620451   7.531401   100
