简体   繁体   中英

How to specify tail value of a Poisson distribution in scipy.stats?

I want to create a Poisson distribution with mean is 2 number of elements is 10000 has min value 1 and tail value 140 so far I can only specify the min

 stats.poisson.rvs( 2, loc = 1,size=10000)

and the following distribution is generated 在此处输入图片说明

How can I make it end at 140 instead of 11

The chance that a sample out of a Poisson distribution with mean equal to 2 is greater than 140 is so small that you would not get one out of just 10000 samples.

Indeed, the Poisson distribution has only one parameter λ and a probability mass function defined so that

P(x=k) = λ^k * exp(-λ) / k!

The mean value is also equal to λ. If λ = 2 then

P(x=140) = 7.7e-199

so if there are 10000 samples the chance that there would be at least one sample at 140 out of 10000 would be less than 7.7e-195. This is a number so small that you cannot expect this to occur in a lifetime.

It is a bit harder to compute the probability that a sample out of Poisson distribution with λ=2 lies above 140. You can use scipy.stats.poisson.cdf to see that

P(x>=22) = 1 - scipy.stats.poisson.cdf(21,2) = 5.5e-16

Therefore even the chance that you would have one sample out of 10000 above 21 is less than 5.5e-12. Computing P(x>=140) in the same way would return 0 because of floating point rounding in intermediate results.

Conclusion

If you want distribution mean equal 2.0, and a heavy tail reaching up to 140 on 10000 samples you need a distribution different from Poisson. You could consider Pareto distribution, scipy.stats.pareto with parameter b = 2.

Here is a comparison of 10000 random samples from

scipy.stats.poisson.rvs(2,size=10000)

and

numpy.rint(scipy.stats.pareto.rvs(2,size=10000))

泊松分布与帕累托分布的比较

It is clearly visible that Pareto distribution with the same mean has a much heavier tail.

For reference the code for the plot is below

import matplotlib.pyplot as plt
import scipy.stats
import numpy as np
pareto_x = np.rint(scipy.stats.pareto.rvs(2,size=10000))
poisson_x = scipy.stats.poisson.rvs(2,size=10000)
plt.figure(figsize=(8,4))
plt.subplot(121)
plt.title("Poisson distribution, a = 2")
plt.xlabel("sample number")
plt.ylabel("sample value")
plt.axis([0,10000,0,180])
plt.plot(range(0,10000),poisson_x,"o")
plt.subplot(122)
plt.axis([0,10000,0,180])
plt.title("Pareto distribution, b = 2")
plt.xlabel("sample number")
plt.plot(range(0,10000),pareto_x,"o")
plt.subplots_adjust(hspace=0.4,bottom=0.2)
plt.savefig("poisson_pareto.png")

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM