
Fit an exponential CDF to data in Python?

I am trying to fit an exponential CDF to my data, both to check whether it is a good fit and to derive an equation from the fit, but I am not sure how, since I think scipy.stats fits the PDF, not the CDF. I have the data below:

eta = [1,0.5,0.3,0.25,0.2];
q = [1e-9,9.9981e-10,9.9504e-10,9.7905e-10,9.492e-10];

How do I fit an exponential CDF to the data? Or how do I find the distribution that fits the data best?

You can define a general exp function, and use curve_fit from scipy.optimize:

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit

def exp_func(x, a, b, c):
    # General exponential: amplitude a, rate b, offset c
    return a * np.exp(-b * x) + c

eta = np.array([1, 0.5, 0.3, 0.25, 0.2])
cdf = np.array([1e-9, 9.9981e-10, 9.9504e-10, 9.7905e-10, 9.492e-10])
popt, pcov = curve_fit(exp_func, eta, cdf)  # least-squares estimates of a, b, c

plt.plot(eta, cdf, 'o', label='data')
# %.3g keeps values on the 1e-9 scale readable; %5.3f would print them as 0.000
plt.plot(eta, exp_func(eta, *popt), 'r-', label='fit: a=%.3g, b=%.3g, c=%.3g' % tuple(popt))
plt.legend()
plt.show()

This gives you an exponential function that closely follows your values.

From the fitted parameters, the rate is b = 19.213, so the function has the form y = a * np.exp(-19.213 * x) + c, with a and c taken from popt.
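Since the question also asks whether the fit is good, here is a rough sketch of one way to check. It is not part of the original answer: rescaling the data to units of 1e-9, the starting guess p0, and the use of R^2 as the quality measure are all assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_func(x, a, b, c):
    return a * np.exp(-b * x) + c

eta = np.array([1, 0.5, 0.3, 0.25, 0.2])
q = np.array([1e-9, 9.9981e-10, 9.9504e-10, 9.7905e-10, 9.492e-10])

# Assumption: work in units of 1e-9 so that a and c are of order 1
q_scaled = q * 1e9

# Assumption: rough starting guess for a curve that rises and levels off near 1
popt, pcov = curve_fit(exp_func, eta, q_scaled, p0=[-1.0, 15.0, 1.0])

# Coefficient of determination from the residuals
residuals = q_scaled - exp_func(eta, *popt)
ss_res = np.sum(residuals ** 2)
ss_tot = np.sum((q_scaled - q_scaled.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

# One-sigma parameter uncertainties from the diagonal of the covariance matrix
perr = np.sqrt(np.diag(pcov))

print("a, b, c =", popt)
print("std errors =", perr)
print("R^2 =", r_squared)
```

With only five points and three free parameters, a high R^2 should be read with caution; the standard errors in perr give a better sense of how well each parameter is pinned down.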

* Update *

If you want to make sure this is really a CDF, you'll need to calculate the PDF (by taking the derivative):

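x = np.linspace(0, 1, 1000)
cdf_fit = exp_func(x, *popt)
# Discrete derivative: per-step increments of the fitted CDF (the first entry keeps the starting value)
cdf_diff = np.r_[cdf_fit[0], np.diff(cdf_fit)]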

As a sanity check, the cumulative sum of the increments should reproduce the fitted CDF:

plt.plot(x, np.cumsum(cdf_diff))
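The same check can be done numerically instead of by eye. The sketch below uses a known exponential CDF with a hypothetical rate lam = 19.0 (an assumption for illustration, not a fitted value) and shows that the per-step increments are probability mass per grid step, while np.gradient gives an estimate of the density itself.

```python
import numpy as np

lam = 19.0  # hypothetical rate, chosen only for illustration
x = np.linspace(0, 1, 1000)
cdf = 1.0 - np.exp(-lam * x)  # exponential CDF for x >= 0

# Per-step increments: probability mass falling in each grid step
increments = np.r_[cdf[0], np.diff(cdf)]

# Density estimate: slope of the CDF with respect to x
density = np.gradient(cdf, x)
true_pdf = lam * np.exp(-lam * x)

# The cumulative sum of the increments should give back the CDF
print(np.allclose(np.cumsum(increments), cdf))
# Largest deviation of the finite-difference density from the analytic PDF
print(np.max(np.abs(density - true_pdf)))
```

The increments differ from the density by roughly a factor of the grid spacing, which matters if they are later compared against values returned by a pdf function.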

Then use scipy to fit an exponential distribution to the PDF:

from scipy.stats import expon
params = expon.fit(cdf_diff)
pdf_fit = expon.pdf(x, *params)

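I must warn you that something doesn't add up: pdf_fit doesn't align with cdf_diff. Maybe your CDF isn't a real distribution function? A CDF should rise to 1 at the upper end of its range, whereas your values level off near 1e-9. Note also that expon.fit estimates parameters from raw samples rather than from density values, which may contribute to the mismatch.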
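One possible way around the normalization issue is to rescale the data before fitting. The sketch below assumes that q saturates at its largest value, so that dividing by q.max() maps it onto [0, 1]; that assumption, the one-parameter model, and the starting guess p0 are mine and not part of the original answer.

```python
import numpy as np
from scipy.optimize import curve_fit

eta = np.array([1, 0.5, 0.3, 0.25, 0.2])
q = np.array([1e-9, 9.9981e-10, 9.9504e-10, 9.7905e-10, 9.492e-10])

# Assumption: the largest value is the plateau, so the ratio behaves like a CDF
q_norm = q / q.max()

def expon_cdf(x, rate):
    # CDF of an exponential distribution with the given rate, for x >= 0
    return 1.0 - np.exp(-rate * x)

# Assumption: starting guess of 10 for the rate
popt, pcov = curve_fit(expon_cdf, eta, q_norm, p0=[10.0])
rate = popt[0]

# Worst-case deviation between the normalized data and the fitted CDF
max_abs_err = np.max(np.abs(q_norm - expon_cdf(eta, rate)))

print("rate =", rate)
print("max abs error =", max_abs_err)
```

If the maximum deviation is small relative to the spread of q_norm, a single-rate exponential CDF is a plausible description; the unnormalized curve is then q.max() * (1 - np.exp(-rate * eta)).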

