Fitting a binomial distribution to a curve with Python
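I am trying to fit this list to a binomial distribution: [0, 1, 1, 1, 3, 5, 5, 9, 14, 20, 12, 8, 5, 3, 6, 9, 13, 15, 18, 23, 27, 35, 25, 18, 12, 10, 9, 5, 0]
I need to retrieve the parameters of the distribution so that I can apply them to some simulations I need to run. I am using scipy: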
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.stats import binom
data = [0, 1, 1, 1, 3, 5, 5, 9, 14, 20, 12, 8, 5, 3, 6, 9, 13, 15, 18, 23, 27, 35, 25, 18, 12, 10, 9, 5, 0]
def fit_function(x, n, p):
    return binom.pmf(x, n, p)
num_bins = 10
params, covmat = curve_fit(fit_function, 10, data)
But I get the following error:
RuntimeError                              Traceback (most recent call last)
      4
      5 # fit with curve_fit
----> 6 parameters, cov_matrix = curve_fit(fit_function, 10, data)

~\AppData\Local\Continuum\anaconda3\envs\py37\lib\site-packages\scipy\optimize\minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
    746         cost = np.sum(infodict['fvec'] ** 2)
    747         if ier not in [1, 2, 3, 4]:
--> 748             raise RuntimeError("Optimal parameters not found: " + errmsg)
    749     else:
    750         # Rename maxfev (leastsq) to max_nfev (least_squares), if specified.

RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 600.
Regardless of the error, how can I fit this data to a binomial curve with Python?
It looks like you need to increase the maximum number of function evaluations, maxfev. Try:
params, covmat = curve_fit(fit_function, 10, data, maxfev=2000)
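Raising maxfev alone may not be enough here. In the call as written, xdata is the scalar 10 rather than an array matching the 29 values in data, the list looks like counts rather than probabilities, and a binomial's n is an integer, so the pmf is not a smooth function of n for a least-squares optimizer. What follows is a minimal sketch of one possible workaround, not a definitive fix. It assumes that data[k] is the number of observations with value k and that n can be fixed at len(data) - 1; neither assumption is stated in the question, and the names counts, freqs and n_fixed are introduced only for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.stats import binom

data = [0, 1, 1, 1, 3, 5, 5, 9, 14, 20, 12, 8, 5, 3, 6, 9, 13, 15,
        18, 23, 27, 35, 25, 18, 12, 10, 9, 5, 0]

# Assumption: data[k] is the number of observations with value k.
counts = np.asarray(data, dtype=float)
x = np.arange(len(counts))        # 0, 1, ..., 28
freqs = counts / counts.sum()     # empirical probabilities, summing to 1

# Assumption: n is fixed at the largest index instead of being fitted,
# because n must be an integer and curve_fit varies parameters continuously.
n_fixed = len(counts) - 1

def fit_function(k, p):
    # Binomial pmf with n held fixed; only p is fitted.
    return binom.pmf(k, n_fixed, p)

# Keep p inside [0, 1] and start from the middle of the interval.
(p_hat,), covmat = curve_fit(fit_function, x, freqs, p0=[0.5], bounds=(0.0, 1.0))
print(n_fixed, p_hat)
```

Note that the list has two peaks (20 at index 9 and 35 at index 21), so a single binomial may describe it poorly whatever the optimizer settings are.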
You can use the distfit library to retrieve the parameters of a discrete distribution. A small example follows:
pip install distfit
# Generate random numbers
from scipy.stats import binom
# Set parameters for the test-case
n = 8
p = 0.5
# Generate 10000 samples of the distribution of (n, p)
X = binom(n, p).rvs(10000)
print(X)
# [4 7 4 ... 2 2 6]
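# Import the library and initialize it for discrete distribution fitting
from distfit import distfit
dfit = distfit(method='discrete')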
# Search for best theoretical fit on your empirical data
dfit.fit_transform(X)
# Get the model and best fitted parameters.
print(dfit.model)
# {'distr': <scipy.stats._distn_infrastructure.rv_frozen at 0x1ff23e3beb0>,
# 'params': (8, 0.4999585504197037),
# 'name': 'binom',
# 'SSE': 7.786589839641551,
# 'chi2r': 1.1123699770916502,
# 'n': 8,
# 'p': 0.4999585504197037,
# 'CII_min_alpha': 2.0,
# 'CII_max_alpha': 6.0}
# Best fitted n=8 and p=0.4999 which is great because the input was n=8 and p=0.5
dfit.model['n']
dfit.model['p']
# The plot function
dfit.plot(chart='PDF',
          emp_properties={'linewidth': 4, 'color': 'k'},
          bar_properties={'edgecolor': 'k', 'color': None},
          pdf_properties={'linewidth': 4, 'color': 'r'})
Disclaimer: I am also the author of this repo.
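The example above fits raw samples, whereas the list in the question looks like per-value counts. If that reading is right (an assumption, since the question does not say so), the counts can be expanded into individual samples before they are handed to a sample-based fitter. The sketch below uses only NumPy; the names samples, n_assumed and p_moment are hypothetical, and the moment estimate is just a rough sanity check rather than part of distfit.

```python
import numpy as np

data = [0, 1, 1, 1, 3, 5, 5, 9, 14, 20, 12, 8, 5, 3, 6, 9, 13, 15,
        18, 23, 27, 35, 25, 18, 12, 10, 9, 5, 0]

# Assumption: data[k] is the number of observations with value k.
counts = np.asarray(data)
values = np.arange(len(counts))

# Repeat each value k exactly counts[k] times to recover a flat sample array.
samples = np.repeat(values, counts)

# Rough method-of-moments check: for a binomial, mean = n * p.
# Assumption: n is taken as the largest index in the list.
n_assumed = len(counts) - 1
p_moment = samples.mean() / n_assumed
print(len(samples), p_moment)
```

The resulting samples array can then play the role of X in the example above, for instance in dfit.fit_transform(samples).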