How to do curve fitting of a curve family with python/scipy
I have to solve the following problem:
I hope I have described my problem clearly and that you can help. I would be very grateful!
Question edited (because of a misunderstanding, 2020_04_04)
I'll try to be more specific now. I have attached a picture showing an example of the "curve family", which changes for different "Sigma". I want to describe the whole curve family with a single set of constants (C1, C2, C3 and C4) that stays fixed. The goal is to find the optimal constants that describe this curve family with only Sigma and T changing as variables. Therefore I have to fit the parameters to a bunch of curves at once with minimum error. Afterwards, the equation should cover the whole family of curves just by changing Sigma and T.
Best regards!
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
#Equation --> Eps_Cr = (C1*Sigma**C2*x**(C3+1)*e(-C4/T))/(C3+1)
def func(x, C1, C2, C3, C4):
    Sigma = 20
    T = 1
    return (C1*Sigma**C2*x**(C3+1)*np.exp(-C4*1/T))/(C3+1)
#Example Data 1
xdata = [1, 10, 100, 1000, 10000, 100000]
ydata = [0.000382,0.000407,0.000658,0.001169,0.002205,0.004304]
#Example Data 2
xdata1 = [1, 10, 100, 1000, 10000, 100000]
ydata1 = [0.002164,0.002371,0.004441,0.008571,0.016811,0.033261]
#Example Data 3
xdata2 = [1, 10, 100, 1000, 10000, 100000]
ydata2 = [0.001332,0.001457,0.002707,0.005157,0.010007,0.019597]
plt.plot(xdata, ydata, 'b-', label='data')
plt.plot(xdata1, ydata1, 'g-', label='data')
plt.plot(xdata2, ydata2, 'y-', label='data')
popt, pcov = curve_fit(func, xdata, ydata)
plt.plot(xdata, func(xdata, *popt), 'r--',
         label='fit: C1=%5.2e, C2=%5.3f, C3=%5.3f,C4=%5.3f' % tuple(popt))
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()
From the extra information you provided in the 'answer', it seems that you want to fit a hierarchical model. At least that is what statisticians often call them. Some parameters are shared between all data points (the parameters C1 to C4), and some parameters are shared within the groups of datasets (T and Sigma). All these parameters need to be estimated from the data.
This is often tackled by building one larger model for all the data, and within the model selecting which of the groupwise parameters to use. If a data point belongs to data group 1, we choose Sigma1 and T1, and so on.
Since you are already using curve_fit, I made a version of your code that does the job. The code style leaves a bit to be desired since I'm no expert in scipy, but I think that you will understand the method at least.
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
def func(x_and_grp, C1, C2, C3, C4, Sigma0, Sigma1, Sigma2, T0, T1, T2):
    # We estimate one sigma and one T per group of data points
    x = x_and_grp[:,0]
    grp_id = x_and_grp[:,1]
    # here we select the appropriate T and Sigma for each data point based on their group id
    T = np.array([[T0, T1, T2][int(gid)] for gid in grp_id])
    Sigma = np.array([[Sigma0, Sigma1, Sigma2][int(gid)] for gid in grp_id])
    return (C1*Sigma**C2*x**(C3+1)*np.exp(-C4*1/T))/(C3+1)
#Example Data in 3 groups
xdata0 = [1, 10, 100, 1000, 10000, 100000]
ydata0 = [0.000382,0.000407,0.000658,0.001169,0.002205,0.004304]
xdata1 = [1, 10, 100, 1000, 10000, 100000]
ydata1 = [0.002164,0.002371,0.004441,0.008571,0.016811,0.033261]
xdata2 = [1, 10, 100, 1000, 10000, 100000]
ydata2 = [0.001332,0.001457,0.002707,0.005157,0.010007,0.019597]
# merge all the data and add the group id to the x-data vectors
y_all = np.concatenate([ydata0, ydata1, ydata2])
x_and_grp_all = np.zeros(shape=(3 * 6, 2))
x_and_grp_all[:, 0] = np.concatenate([xdata0, xdata1, xdata2])
x_and_grp_all[0:6, 1] = 0
x_and_grp_all[6:12, 1] = 1
x_and_grp_all[12:18, 1] = 2
# fit a model to all the data together
popt, pcov = curve_fit(func, x_and_grp_all, y_all)
xspace = np.logspace(1,5)
plt.plot(xdata0, ydata0, 'b-', label='data')
plt.plot(xdata1, ydata1, 'g-', label='data')
plt.plot(xdata2, ydata2, 'y-', label='data')
for gid,color in zip([0,1,2],['r','k','purple']):
    # popt follows the argument order of func: C1..C4, Sigma0..Sigma2, T0..T2
    Sigma = popt[4+gid]
    T = popt[7+gid]
    x_and_grp = np.column_stack([xspace,np.ones_like(xspace)*gid])
    plt.plot(xspace,
             func(x_and_grp, *popt),
             linestyle='dashed', color=color,
             label='fit: T=%5.2e, Sigma=%5.3f' % (T,Sigma))
plt.xlabel('X')
plt.ylabel('Y')
plt.title('fit: C1=%5.2e, C2=%5.3f, C3=%5.3f,C4=%5.3f' % tuple(popt[0:4]))
plt.legend()
plt.show()
The output looks like this: [figure: the three data sets with their dashed fitted curves]
Finally, I want to add that curve_fit is not that well suited for this task if you have a lot of different groups. Consider some other library that could be relevant; statsmodels could be an option. One alternative is to reach for scipy.optimize.minimize instead, since it gives you more flexibility. You need to do the confidence interval estimation manually though...
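As a rough illustration of the scipy.optimize.minimize route, here is a minimal sketch using the three example data sets from the question. Several things in it are my own assumptions rather than part of the original code: the parameter vector layout (`theta`), the helper names `log_model` and `loss`, the starting values, the bounds, and the choice to minimise squared residuals of the logarithms (the model is a pure product, so its log is easy to write down and keeps the objective well scaled).

```python
import numpy as np
from scipy.optimize import minimize

# Example data from the question, stacked into flat arrays
x = np.tile([1, 10, 100, 1000, 10000, 100000], 3).astype(float)
y = np.array([0.000382, 0.000407, 0.000658, 0.001169, 0.002205, 0.004304,
              0.002164, 0.002371, 0.004441, 0.008571, 0.016811, 0.033261,
              0.001332, 0.001457, 0.002707, 0.005157, 0.010007, 0.019597])
grp = np.repeat([0, 1, 2], 6)   # integer group id of every data point
n_groups = 3

def log_model(theta, x, grp):
    # Assumed layout: theta = [C1, C2, C3, C4, Sigma_0.., T_0..]
    C1, C2, C3, C4 = theta[:4]
    Sigma = theta[4:4 + n_groups][grp]   # pick each point's group Sigma
    T = theta[4 + n_groups:][grp]        # pick each point's group T
    # log of (C1*Sigma**C2*x**(C3+1)*exp(-C4/T))/(C3+1)
    return (np.log(C1) + C2 * np.log(Sigma) + (C3 + 1) * np.log(x)
            - C4 / T - np.log(C3 + 1))

def loss(theta):
    # Sum of squared residuals in log space (an assumption, see above)
    return np.sum((log_model(theta, x, grp) - np.log(y)) ** 2)

# Arbitrary starting values; bounds keep C1, Sigma, T positive and C3 > -1
theta0 = np.array([1e-4, 1.0, -0.7, 1.0] + [20.0] * n_groups + [1.0] * n_groups)
bounds = ([(1e-12, None), (None, None), (-0.99, None), (None, None)]
          + [(1e-6, None)] * (2 * n_groups))

result = minimize(loss, theta0, method="L-BFGS-B", bounds=bounds)
print("loss at start:", loss(theta0), "loss at optimum:", result.fun)
print("C1..C4:", result.x[:4])
```

Adding more groups only means making `theta` longer, since the group lookup is plain array indexing. Keep in mind that Sigma and T enter the model only through one multiplicative factor per group, so their individual fitted values should be read with caution.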
I also want to add that the method above is overly complicated if you know the T and Sigma for each group of data. In that case we add the relevant values of Sigma and T to the x-vector, instead of a group id.
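A minimal sketch of that simpler case follows. Since the real Sigma and T values of the example curves are not given in the question, the `known_conditions` below are made up, and the y-values are generated synthetically from made-up constants (`true_params`) purely so the example is self-contained. The name `family_model` and the starting guess `p0` are likewise my own choices.

```python
import numpy as np
from scipy.optimize import curve_fit

def family_model(X, C1, C2, C3, C4):
    # X has shape (3, n_points): rows are x, Sigma and T of each data point
    x, Sigma, T = X
    return (C1 * Sigma**C2 * x**(C3 + 1) * np.exp(-C4 / T)) / (C3 + 1)

# Hypothetical known (Sigma, T) per curve; replace with the real values
known_conditions = [(10.0, 1.0), (20.0, 1.0), (20.0, 2.0)]
x_single = np.array([1, 10, 100, 1000, 10000, 100000], dtype=float)

# Stack x, Sigma and T of all curves side by side -> shape (3, 18)
X_all = np.hstack([
    np.vstack([x_single,
               np.full_like(x_single, sigma),
               np.full_like(x_single, temp)])
    for sigma, temp in known_conditions
])

# Synthetic, noise-free y-data from made-up constants (illustration only);
# with real measurements, concatenate the measured y-values instead
true_params = (1e-3, 0.5, -0.7, 1.0)
y_all = family_model(X_all, *true_params)

# Only C1..C4 are fitted; Sigma and T come in through the x-data
p0 = (1.5e-3, 0.45, -0.65, 0.9)   # assumed starting guess
popt, pcov = curve_fit(family_model, X_all, y_all, p0=p0)
print("fitted C1..C4:", popt)
```

Here curve_fit only has to estimate the four shared constants, which is a much smaller problem than also estimating one Sigma and one T per group. The curves need at least two different Sigma values and two different T values between them, otherwise C1, C2 and C4 cannot be separated from each other.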
As per your query, I understand that you need to fit one equation to three different datasets separately. So I updated your code accordingly, keeping Sigma and T the same. Please have a look and let me know.
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
#Equation --> Eps_Cr = (C1*Sigma**C2*x**(C3+1)*e(-C4/T))/(C3+1)
def func(x, C1, C2, C3, C4):
    Sigma = 20
    T = 1
    return (C1*Sigma**C2*x**(C3+1)*np.exp(-C4*1/T))/(C3+1)
#Example Data 1
xdata = [1, 10, 100, 1000, 10000, 100000]
ydata = [0.000382,0.000407,0.000658,0.001169,0.002205,0.004304]
#Example Data 2
xdata1 = [1, 10, 100, 1000, 10000, 100000]
ydata1 = [0.002164,0.002371,0.004441,0.008571,0.016811,0.033261]
#Example Data 3
xdata2 = [1, 10, 100, 1000, 10000, 100000]
ydata2 = [0.001332,0.001457,0.002707,0.005157,0.010007,0.019597]
plt.plot(xdata, ydata, 'b-', label='data 1')
plt.plot(xdata1, ydata1, 'g-', label='data 2')
plt.plot(xdata2, ydata2, 'y-', label='data 3')
popt, pcov = curve_fit(func, xdata, ydata)
popt1, pcov1 = curve_fit(func, xdata1, ydata1)
popt2, pcov2 = curve_fit(func, xdata2, ydata2)
plt.plot(xdata, func(xdata, *popt), 'r.',
         label='fit for Data 1: C1=%5.2e, C2=%5.3f, C3=%5.3f,C4=%5.3f' % tuple(popt))
plt.plot(xdata1, func(xdata1, *popt1), 'r+',
         label='fit for Data 2: C1=%5.2e, C2=%5.3f, C3=%5.3f,C4=%5.3f' % tuple(popt1))
plt.plot(xdata2, func(xdata2, *popt2), 'r--',
         label='fit for Data 3 : C1=%5.2e, C2=%5.3f, C3=%5.3f,C4=%5.3f' % tuple(popt2))
plt.xlabel('X')
plt.ylabel('Y')
plt.legend(loc='upper left',prop={'size': 8})
plt.show()