简体   繁体   中英

pdf estimation with scipy.stats

Say I compute the density of Beta(4,8):

from scipy.stats import beta
rv = beta(4, 8)
x = np.linspace(start=0, stop=1, num=200)
my_pdf = rv.pdf(x)

Why does the integral of the pdf not equal one?

> my_pdf.sum()
199.00000139548044

The integral over the pdf is one. You can see this by using numerical integration from scipy

>>> from scipy.integrate import quad
>>> quad(rv.pdf, 0, 1)
(0.9999999999999999, 1.1102230246251564e-14)

or by writing your own ad-hoc integration (with a trapezoidal rule in this example)

>>> x = numpy.linspace(start=0, stop=1, num=201)
>>> (0.5 * rv.pdf(x[0]) + rv.pdf(x[1:-1]).sum() + 0.5 * rv.pdf(x[-1])) / 200.0
1.0000000068732813

rv.pdf returns the value of the pdf at each value of x . It doesn't sum to one because your aren't actually computing an integral. If you want to do that, you need to divide your sum by the number of intervals, which is len(x) - 1 , which is 199. That would then give you a result very close to 1.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM