
Sample from custom distribution in R

I have implemented an alternate parameterization of the negative binomial distribution in R, like so (also see here):

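nb = function(n, l, a){
  # choose(n + a - 1, n) equals choose(n + a - 1, a - 1) for integer a,
  # and also works for non-integer a (choose() rounds a non-integer second argument)
  first = choose((n + a - 1), n)
  second = (l/(l+a))^n
  third = (a/(l+a))^a
  return(first*second*third)
}

Where n is the count, l (lambda) is the mean, and a is the overdispersion term.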

I would like to draw random samples from this distribution to validate my implementation of a negative binomial mixture model, but I am not sure how to go about it. The CDF of this function isn't easily defined, so I tried rejection sampling as discussed here, but that didn't work either. I'm not sure why: the article says to first draw from a uniform distribution between 0 and 1, but I want my NB distribution to model integer counts, so I may not fully understand the approach.

Thank you for your help.

I recommend you look up the uniform distribution as well as the universality of the uniform. You can do exactly what you want by passing a uniformly distributed variable to the inverse CDF of the negative binomial; what you get is a set of points sampled from your negative binomial distribution.
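As a rough sketch of this idea: base R's qnbinom evaluates the negative binomial quantile function (the inverse CDF) numerically, so uniform draws passed through it come out as negative binomial draws. The mapping size = a, mu = l and the values l = 10, a = 2 are assumptions chosen for illustration, not taken from the question.

```r
set.seed(1)
l <- 10   # mean (illustrative value)
a <- 2    # overdispersion (illustrative value)

u <- runif(10000)                   # uniform draws on (0, 1)
x <- qnbinom(u, size = a, mu = l)   # push them through the inverse CDF

mean(x)   # should be close to l
var(x)    # should be close to l + l^2 / a
```

The result is always an integer count, because the quantile function of a discrete distribution returns the smallest count whose CDF reaches u.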

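EDIT: I see that the negative binomial has a CDF with no closed-form inverse. My second recommendation would be to scrap your function and use a built-in:

library(MASS)
# Signature: rnegbin(n, mu = n, theta = stop("'theta' must be specified"))
# n is the number of draws, mu is the mean (your l), theta is the dispersion (your a)
x <- rnegbin(1000, mu = 10, theta = 2)  # example values only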
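Before switching to a built-in, it may be worth checking that it uses the same parameterization as the hand-written PMF. The sketch below assumes the question's nb() corresponds to base R's dnbinom with size = a and mu = l, and uses illustrative values l = 10, a = 2; the binomial coefficient is written as choose(n + a - 1, n), which is equal to the question's form for integer a.

```r
# PMF from the question (coefficient written in the equivalent choose(n + a - 1, n) form)
nb <- function(n, l, a) {
  choose(n + a - 1, n) * (l / (l + a))^n * (a / (l + a))^a
}

l <- 10   # illustrative mean
a <- 2    # illustrative overdispersion
counts <- 0:50

# TRUE if the two parameterizations agree on these counts
same_pmf <- isTRUE(all.equal(nb(counts, l, a), dnbinom(counts, size = a, mu = l)))
same_pmf

# Base R sampler with the same parameterization (no extra package needed)
set.seed(1)
x <- rnbinom(10000, size = a, mu = l)
```

If the check passes, samples from rnbinom (or MASS::rnegbin with mu = l, theta = a) can stand in for samples from the custom function.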

It seems like you could:

1) Draw a uniform random number between zero and one.

2) Accumulate the probability mass function into a CDF (this is really just a running sum, since the distribution is discrete and lower-bounded at zero).

3) Whichever value in your integration takes the cdf past your random number, that's your random draw.

So all together, do something like the following:

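r <- runif(1,0,1)     # uniform draw on (0, 1)
cdf <- 0              # running sum of the PMF
i <- -1               # incremented to 0 on the first pass
while(cdf < r){
  i <- i+1
  p <- PMF(i)         # PMF() is a placeholder for your probability mass function
  cdf <- cdf + p
}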

Where PMF(i) is the probability mass over a count of i, as specified by the parameters of the distribution. The value of i when this while-loop finishes is your sample.
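One way to package the loop above as a reusable sampler is sketched here. The helper name rnb_one, the max_count safety cap, and the values l = 10, a = 2 are hypothetical choices for illustration; nb() is the PMF from the question, with the coefficient written as choose(n + a - 1, n), which is equal to the question's form for integer a.

```r
# PMF from the question (coefficient written in the equivalent choose(n + a - 1, n) form)
nb <- function(n, l, a) {
  choose(n + a - 1, n) * (l / (l + a))^n * (a / (l + a))^a
}

# Draw a single count by inversion: add up PMF values until the
# running CDF passes a uniform draw. max_count is a guard against
# floating-point rounding keeping the CDF just below r.
rnb_one <- function(l, a, max_count = 1e6) {
  r <- runif(1)
  cdf <- 0
  i <- -1
  while (cdf < r && i < max_count) {
    i <- i + 1
    cdf <- cdf + nb(i, l, a)
  }
  i
}

set.seed(42)
x <- replicate(5000, rnb_one(l = 10, a = 2))
mean(x)   # should be close to l = 10
var(x)    # should be close to l + l^2 / a = 60
```

Comparing the sample mean and variance with l and l + l^2 / a gives a quick sanity check on both the PMF and the sampler.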

If you really just want to test and so speed is not the issue, the inversion method, as mentioned by others, is probably the way to go.

For a discrete random variable, it requires a simple while loop. See Non-Uniform Random Variate Generation by L. Devroye, chapter 3, p. 85.
