How to fit data with equations using minimize in Python to obtain model parameters

I'm new to Python and programming. I wrote the code below to find the optimum model parameters (R0, t_inc, t_rec, ex, teta) by minimizing the error between the data and the model (several differential equations). I am stuck on how to define the error function, as seen in the code below:

import numpy as np
import pandas as pd
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import datetime
from lmfit import Parameters, fit_report, minimize

totaldays = 93  # as of today                
# This code is only to adjust the R0, t_infective, t_incubation to match data until to date
n_to_start = 0 # start data to fit
n_to_fit = totaldays # end data to fit -1, 
NumberofTest = 136.73*n_to_fit**2 - 17609*n_to_fit + 605936

# getting data till to date
DataMalaysia = pd.read_csv('DataMalaysia.csv')
Dates = DataMalaysia.iloc[:,0].values
TotalCase = DataMalaysia.iloc[:,3].values
TotalRecovered = DataMalaysia.iloc[:, 5].values
TotalDeath = DataMalaysia.iloc[:,4].values
Days = DataMalaysia.iloc[:, 1].values
ActiveCase = DataMalaysia.iloc[:, 7].values
Daystofit = Days[:n_to_fit] 
Dates = Dates[:n_to_fit]
start_date = datetime.date(2020,1,22)
ActiveCasetofit = ActiveCase[:n_to_fit]
TotalRecoveredtofit = TotalRecovered[:n_to_fit]
TotalDeathtofit = TotalDeath[:n_to_fit]


# parameter values including death and immigration
N = NumberofTest           # number of population
i_initial = 4       # 4 people is infected at the beginning 25th Jan 2020
# parameters around the Susceptible population (possible to get infected)
immigrating_s = 0   # fraction of population immigrating into the infected location
death_s = 0         # fraction of population died due to other diseases
# parameters around the Exposed or Infected people (but not yet Infecting)
immigrating_e = 0   # fraction of the infected people immigrating into the infected location
death_e = 0         # fraction of the infected/exposed people die due to other diseases
#parameters around the Infectious population
immigrating_i = 0   # fraction of the infectious people immigrating into the infected location
# death_i_MCO = 0.0157      # fraction of the infectious people die due to the virus
# mitigation effort

# variables to fit
R0 = 2.96   # reproduction number. This number is relatively high
t_inc = 11.93  # incubation period (5-6 is most reported one)
t_rec = 1.24   # infectious period, gamma = 1/t_infectious is the recovery rate, typical 3-4 days
ex = 0.016 #death fraction
teta = 0.1 # recovered fraction without getting ill

# using population
e0 = 0
i0 = i_initial
r0 = 0
d0 = 0
rprime0 = 0
s0 = N - e0 - i0 - r0 - d0 - rprime0

# SEIR model including MCO
def SEIR(x, t, R0, t_inc, t_rec, ex, teta):
    # introduction of the variables to calculate
    s, e, i, r, rprime, d = x
    alpha = 1/t_inc
    gamma = 1/t_rec
    R0t = R0/N
    beta = R0t*gamma
    # the differential equations
    dsdt = -(1-u)*beta * s * i + (immigrating_s - death_s)*s
    dedt = (1-u)*beta * s * i - alpha*e - teta*e + (immigrating_e - death_e)*e
    didt = alpha * e - gamma * i + (immigrating_i - ex)*i
    drdt = gamma*i
    drprimedt = teta*e
    dddt = ex*i

    return [dsdt, dedt, didt, drdt, drprimedt, dddt]

# integrating the SEIR model
def integrate_i(t, R0, t_inc, t_rec, ex, teta):
    x0 = s0, e0, i0, r0, rprime0, d0
    solution = odeint(SEIR, x0, t, args = (R0, t_inc, t_rec, ex, teta)).T
    solutiona = solution.T
    return solutiona[:, 2]

def integrate_r(t, R0, t_inc, t_rec, ex, teta):
    x0 = s0, e0, i0, r0, rprime0, d0
    solution = odeint(SEIR, x0, t, args = (R0, t_inc, t_rec, ex, teta)).T
    solutiona = solution.T
    return solutiona[:, 3]

def integrate_d(t, R0, t_inc, t_rec, ex, teta):
    x0 = s0, e0, i0, r0, rprime0, d0
    solution = odeint(SEIR, x0, t, args = (R0, t_inc, t_rec, ex, teta)).T
    solutiona = solution.T
    return solutiona[:, 5]

def integrate_total(t_total, R0, t_inc, t_rec, ex, teta):
    #slicing the time frame to each integration
    ti = t_total[:n_to_fit]
    td = t_total[len(ti)+len(ti):]
    tr = t_total[len(ti):len(t_total)-len(td)]
    result_i = integrate_i(ti, R0, t_inc, t_rec, ex, teta)
    result_r = integrate_r(tr, R0, t_inc, t_rec, ex, teta)
    result_d = integrate_d(td, R0, t_inc, t_rec, ex, teta)
    return np.concatenate([result_i, result_r, result_d])


def error(t_total, R0, t_inc, t_rec, ex, teta):
    R0 = 2.96   # reproduction number. This number is relatively high
    t_inc = 11.93  # incubation period (5-6 is most reported one)
    t_rec = 1.24   # infectious period, gamma = 1/t_infectious is the recovery rate, typical 3-4 days
    ex = 0.016 #death fraction
    teta = 0.1
    total_error = (np.sum((integrate_total(t_total, R0, t_inc, t_rec, ex, teta)-y_total)**2))

    return total_error

# fitting predictions with data points
start = n_to_start
end = n_to_fit
t = np.linspace(start, end, end)
ta = np.array(t)
# t_total = np.append(ta, ta, ta)
t_total = np.concatenate([ta, ta, ta])
y_total = np.concatenate([ActiveCasetofit, TotalRecoveredtofit, TotalDeathtofit])
p0=[R0, t_inc, t_rec, ex, teta]

params, extras = minimize(error, p0 ,method='BFGS',
                          options={'disp':True})

# Getting the optimized variables to plot what happens after to date
# getting the optimum values of R0, t_inc, t_rec, ex, teta
R0 = params[0]
t_incubation = params[1]
t_recovery = params[2]
death_ratio = params[3]
recovery_ratio = params[4]

#generation of the fitting curve
Predicted_ActiveCase = integrate_i(t, *params)
Predicted_RecoveredCase = integrate_r(t, *params)
Predicted_Death = integrate_d(t, *params)
print("Optimum R0 is {0:.2f}".format(params[0]))
print("Optimum Incubation Period is {0:.2f} days".format(params[1]))
print("Optimum Recovery Period is {0:.2f} days".format(params[2]))
print("Optimum Death Ratio is {0:.4f} ".format(params[3]))
print("Optimum Recovery Ratio Without Getting Ill is {0:.4f} ".format(params[4]))

# plotting data and results
fig1 = plt.figure()
t_fit = np.array([start_date+datetime.timedelta(days=i) for i in range(n_to_fit)])
plt.scatter(t_fit, ActiveCasetofit, c='red', label ='Data To Date')
plt.plot(t_fit, Predicted_ActiveCase, "r", label ='Fitted Active Case To Date')
plt.scatter(t_fit, TotalRecoveredtofit, c='blue', label ='Data To Date')
plt.plot(t_fit, Predicted_RecoveredCase, "b", label ='Fitted Total Recovered Case To Date')
plt.scatter(t_fit, TotalDeathtofit, c='green', label ='Data To Date')
plt.plot(t_fit, Predicted_Death, "g", label ='Fitted Death To Date')
plt.title('Fitting Data to Date')
plt.xlabel('Time/days')
plt.ylabel('Population')
plt.legend(loc='best')

It gives me the following error, which I think is because I don't know how to pass the arguments to the function. This is the whole error:

runfile('C:/.../Python/SEIR_v8.py', wdir='C:/.../Python')
Traceback (most recent call last):

  File "C:...\Python\SEIR_v8.py", line 171, in <module>
    params, extras = minimize(error, p0, method='BFGS')

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\lmfit\minimizer.py", line 2393, in minimize
    return fitter.minimize(method=method)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\lmfit\minimizer.py", line 2176, in minimize
    return function(**kwargs)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\lmfit\minimizer.py", line 931, in scalar_minimize
    ret = scipy_minimize(self.penalty, variables, **fmin_kws)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\scipy\optimize\_minimize.py", line 604, in minimize
    return _minimize_bfgs(fun, x0, args, jac, callback, **options)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\scipy\optimize\optimize.py", line 1003, in _minimize_bfgs
    old_fval = f(x0)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\scipy\optimize\optimize.py", line 327, in function_wrapper
    return function(*(wrapper_args + args))

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\lmfit\minimizer.py", line 598, in penalty
    r = self.__residual(fvars, apply_bounds_transformation)

  File "C:\Users...\AppData\Local\Continuum\anaconda3\envs\PythonNew\lib\site-packages\lmfit\minimizer.py", line 530, in __residual
    out = self.userfcn(params, *self.userargs, **self.userkws)

TypeError: error() missing 5 required positional arguments: 'R0', 't_inc', 't_rec', 'ex', and 'teta'

Please help.

Regards,

Zulfan

You are calling minimize() incorrectly. I'm too tired right now to work out the details, so I suggest you read the SciPy documentation carefully. You need to pass in a vector x0 as the initial guess, as well as args, which holds the fixed parameters of the function you are minimizing. As it stands, you are only passing in x0, but your error function expects additional parameters.
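The calling convention described above can be sketched on a toy problem. This is a minimal illustration with `scipy.optimize.minimize`, not the asker's SEIR model: the exponential-decay model, the synthetic data, the names `sum_squared_error`, `t_data` and `y_data`, and the choice of Nelder-Mead are all assumptions made for the example. The point is that the first argument of the objective is the vector being optimized, and everything else arrives through `args`.

```python
import numpy as np
from scipy.optimize import minimize


def sum_squared_error(x, t, y):
    """Scalar objective: x is the parameter vector, t and y come from `args`."""
    a, b = x
    y_model = a * np.exp(-b * t)
    return np.sum((y_model - y) ** 2)


# Synthetic, noise-free data generated from known parameters a=2.0, b=0.5
t_data = np.linspace(0.0, 5.0, 50)
y_data = 2.0 * np.exp(-0.5 * t_data)

x0 = [1.0, 1.0]  # initial guess for (a, b)
result = minimize(sum_squared_error, x0, args=(t_data, y_data),
                  method="Nelder-Mead")

a_fit, b_fit = result.x
print("a = {0:.3f}, b = {1:.3f}".format(a_fit, b_fit))
```

Note that the traceback in the question shows the call going through lmfit's `minimize` rather than SciPy's. lmfit follows the same idea, but it passes a `Parameters` object as the first argument instead of a plain vector, so the objective has to read its values from that object.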

I modified the error function as shown below, and it works.

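import lmfit  # needed because the code below uses the lmfit.* namespace

# Residual function in the form lmfit expects: the Parameters object comes
# first, followed by the extra arguments passed through `args`.
def error(params, t_total, y_total):
    R0 = params['R0'].value
    t_inc = params['t_inc'].value
    t_rec = params['t_rec'].value
    ex = params['ex'].value
    teta = params['teta'].value

    y_model = integrate_total(t_total, R0, t_inc, t_rec, ex, teta)
    # Return the residual array; for scalar solvers such as Powell,
    # lmfit reduces it to a sum of squares itself.
    return y_model - y_total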


params = lmfit.Parameters()
params.add('R0', 3.65, min=0, max=5.0)
params.add('t_inc', 15, min=0, max=20.0)
params.add('t_rec', 1, min=0, max=2.0)
params.add('ex', 0.02, min=0, max=1.0)
params.add('teta', 0.1, min=0, max=1.0)

# fitting predictions with data points
start = n_to_start
end = n_to_fit
t = np.linspace(start, end, end)
ta = np.array(t)
t_total = np.concatenate([ta, ta, ta])
y_total = np.concatenate([ActiveCasetofit, TotalRecoveredtofit, TotalDeathtofit])

o1 = lmfit.minimize(error, params, args=(t_total, y_total), method='powell')
print("# Fit using powell:")
lmfit.report_fit(o1)

Then I can vary the solver and the initial values to get a better fit.
