SciPy.optimize.least_squares() 5PL curve optimization problem

Question

I am trying to write a script that takes input arrays of x and y values and fits them to a 5-PL curve, defined by the equation F(x) = D + (A-D)/((1 + (x/C)^B)^E). I then want to use the fitted curve to take a given y value and solve for the corresponding x value, using the inverse equation x = C*(((A-D)/(y-D))^(1/E) - 1)^(1/B).
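A quick way to check that the two equations above really are inverses of each other is a round trip through both. This is only a minimal sketch: the parameter values are hypothetical and chosen just to exercise the formulas, and `inverse_logistic5` is an illustrative name, not something from the script below.

```python
import numpy as np


def logistic5(x, A, B, C, D, E):
    """5PL curve: D + (A - D) / (1 + (x / C)**B)**E."""
    return D + (A - D) / np.power(1.0 + np.power(x / C, B), E)


def inverse_logistic5(y, A, B, C, D, E):
    """Solve logistic5(x) = y for x; only valid for y strictly between A and D."""
    return C * np.power(np.power((A - D) / (y - D), 1.0 / E) - 1.0, 1.0 / B)


# Hypothetical parameters (A, B, C, D, E), not fitted values
params = (100.0, 1.2, 50.0, 10000.0, 0.8)
x = np.array([1.41, 4.63, 15.0, 38.0, 130.0])

# Map x to y and back again; the result should reproduce x
x_back = inverse_logistic5(logistic5(x, *params), *params)
print(np.allclose(x, x_back))
```

The inverse is only defined for y values strictly between the two asymptotes A and D; outside that range the inner power receives a negative base and returns NaN.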

The answer below fixed the previous error, but the fit is still really bad. I've added a print loop with a handful of y values across the range of the data passed to the fit, and it yields almost exactly the same x value for all of them. Any ideas what may be going on here?

Edit: For anyone looking now, the problem appears to have been my estimate for B. The Hill slope should be between -1 and 1 in most cases, not in the thousands. My initial guess was too far off for the optimizer to recover.
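One way to see why a large slope guess hurts is to evaluate the model at the data points for the starting parameters. The sketch below uses the toy data and the initial-guess formulas from the script that follows; `B_gentle = 1.0` is an assumed alternative of order 1, not a recommended value.

```python
import numpy as np


def logistic5(x, A, B, C, D, E):
    """5PL curve: D + (A - D) / (1 + (x / C)**B)**E."""
    return D + (A - D) / np.power(1.0 + np.power(x / C, B), E)


# Toy data from the question
x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117])

A, D = y.min(), y.max()
C = (x.max() - x.min()) / 2

# Slope guess used in the question: (D - A) / (max(x) - min(x)), about 70 here
B_steep = (D - A) / (x.max() - x.min())
# Assumed alternative: a Hill slope of order 1
B_gentle = 1.0

steep = logistic5(x, A, B_steep, C, D, 1.0)
gentle = logistic5(x, A, B_gentle, C, D, 1.0)

print("B =", B_steep, "->", steep)
print("B =", B_gentle, "->", gentle)
```

With the steep guess every prediction sits on one of the two asymptotes, so the starting curve is effectively a step function and small parameter changes barely move the residuals. With a slope of order 1 the predictions vary smoothly across the data range, which gives the optimizer something to work with.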

import numpy as np
import scipy.optimize as sp


def logistic5(x, A, B, C, D, E):
    '''5PL logistic equation'''
    log = D + (A-D)/(np.power((1 + np.power((x/C), B)), E))
    return log


def residuals(p, y, x):
    '''Deviations of data from fitted 5PL curve'''
    A, B, C, D, E = p
    err = y - logistic5(x, A, B, C, D, E)
    print(err)
    return err


def log_solve_for_x(curve, y):
    '''Returns the estimated x value for the provided y value'''
    A, B, C, D, E = curve
    return C*(np.power((np.power(((A-D)/(-D+y)), (1/E))-1), (1/B)))


# Toy data set
x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117])

# Set initial guess for parameters
A = np.amin(y)  # Min asymptote
D = np.amax(y)  # Max asymptote
B = (D-A)/(np.amax(x)-np.amin(x))  # Steepness
C = (np.amax(x)-np.amin(x))/2  # inflection point
E = 1  # Asymmetry factor

# Optimize curve for initial parameters
p0 = [A, B, C, D, E]
# set bounds for each parameter
pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*0.5)
print(pu)
print(pl)
print("Initial guess of parameters is: ", p0)
curve = sp.least_squares(fun=residuals, x0=p0, args=(y, x), bounds=(pl, pu))
curve = curve.x.tolist()
print("Optimized curve parameters are: ", curve)

# Predict x values based on given y (separate name so the data array y is not overwritten)
y_samples = [1000, 2000, 3000, 4000, 5000, 6000, 7000, 8000, 9000]
for sample in y_samples:
    solve = log_solve_for_x(curve, sample)
    print("Predicted X value for y =", sample, " is: ", solve)

Answer 1

Your curve is not defined for all parameter values, but you did not give least_squares that information. At some point the solver wanders into an inadmissible region and gets stuck there, obtaining NaNs from residuals, and you get warnings about an invalid value in power. The exponents are the easy part: you can simply require E >= 0 and B >= 0. The base is not so simple. You either need to switch to a solver that supports generic constraints (e.g. scipy.optimize.minimize) and add the constraint that the base is >= 0, or restrict the search to the admissible domain in some other way, e.g.:

pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*.5)

curve = sp.least_squares(fun=residuals, x0=p0, args=(y, x), bounds=(pl, pu))
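An alternative to scaling the bounds off the initial guess is to bound the parameters by what keeps the model defined. The following is a sketch under stated assumptions: all x values are positive, the starting point and the numeric limits are illustrative choices, and `x_scale="jac"` is used only because the parameters differ by several orders of magnitude.

```python
import numpy as np
from scipy.optimize import least_squares


def logistic5(x, A, B, C, D, E):
    """5PL curve: D + (A - D) / (1 + (x / C)**B)**E."""
    return D + (A - D) / np.power(1.0 + np.power(x / C, B), E)


def residuals(p, y, x):
    """Deviations of the data from the 5PL curve."""
    return y - logistic5(x, *p)


# Toy data from the question
x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117], dtype=float)

# Assumed starting point: asymptotes from the data, slope and asymmetry of 1
p0 = np.array([y.min(), 1.0, np.median(x), y.max(), 1.0])

# Assumed bounds in the order A, B, C, D, E. B, C and E stay strictly
# positive so that, for x > 0, the base 1 + (x / C)**B is at least 1.
lower = [-np.inf, 1e-3, 1e-3, -np.inf, 1e-3]
upper = [np.inf, 10.0, np.inf, np.inf, 10.0]

fit = least_squares(residuals, p0, args=(y, x),
                    bounds=(lower, upper), x_scale="jac")
print("Fitted parameters:", fit.x)
print("Cost:", fit.cost)
```

With only five data points and five parameters the fit is not well constrained, so the fitted values should be treated with caution. The point here is only that the solver never leaves the region where the residuals are finite.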

You may also try to fix your residual function so that it works for any parameters, e.g. by replacing NaN with the distance to the initial guess. But this may work inefficiently.
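One possible reading of that suggestion is sketched below: any non-finite residual is replaced by a penalty that grows with the distance from a reference point. The names `safe_residuals`, `p_ref` and `penalty` are hypothetical helpers for illustration, and the penalty size is an arbitrary assumption.

```python
import numpy as np


def logistic5(x, A, B, C, D, E):
    """5PL curve: D + (A - D) / (1 + (x / C)**B)**E."""
    return D + (A - D) / np.power(1.0 + np.power(x / C, B), E)


def safe_residuals(p, y, x, p_ref, penalty=1e6):
    """Residuals that stay finite outside the admissible domain.

    p_ref and penalty are hypothetical helper arguments, not part of SciPy.
    """
    p = np.asarray(p, dtype=float)
    # Silence the warnings raised when the power is undefined or overflows
    with np.errstate(invalid="ignore", divide="ignore", over="ignore"):
        err = y - logistic5(x, *p)
    bad = ~np.isfinite(err)
    if bad.any():
        # Penalty grows with the distance from the reference point
        distance = np.linalg.norm(p - np.asarray(p_ref, dtype=float))
        err = np.where(bad, penalty * (1.0 + distance), err)
    return err


# Toy data from the question
x = np.array([130, 38, 15, 4.63, 1.41])
y = np.array([9121, 1987, 1017, 343, 117], dtype=float)
p_ref = [117.0, 1.0, 15.0, 9121.0, 1.0]

# A negative C makes x / C negative, so the fractional power returns NaN
p_bad = [117.0, 0.5, -10.0, 9121.0, 1.0]
print(safe_residuals(p_bad, y, x, p_ref))
```

To use it with the solver, the reference point would be passed as an extra argument, e.g. `args=(y, x, p_ref)`. The penalty makes the residual surface discontinuous at the edge of the domain, which is one reason this approach can converge slowly.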

To improve the fit, you may try a better initial point, multistart, or both.

A = np.amin(y)  # Min asymptote
D = np.amax(y)  # Max asymptote
B = (D-A)/np.amax(x)*10  # Steepness
C = np.amax(x)/10  # inflection point
E = 0.001  # Asymmetry factor

p0 = [A, B, C, D, E]
print("Initial guess of parameters is: ", p0)
pu = []
pl = []
for p in p0:
    pu.append(p*1.5)
    pl.append(p*.5)

best_cost = np.inf
for trial in range(100):
    # Draw a random starting point inside the bounds
    for j in range(5):
        p0[j] = np.random.uniform(pl[j], pu[j])

    curve = sp.least_squares(fun=residuals, x0=p0, args=(y, x), bounds=(pl, pu))
    print(p0, curve.cost)
    if best_cost > curve.cost:
        best_cost = curve.cost
        curve_out = curve.x.tolist()
print("Optimized curve parameters are: ", curve_out)

import matplotlib.pyplot as plt

plt.plot(x, y, '.')

# Evaluate the fitted curve on a dense grid
xx = np.linspace(0, 150, 100)
yy = []
for xi in xx:
    yy.append(logistic5(xi, *curve_out))

plt.plot(xx, yy)
plt.show()

Answer 1 · 0 · Accepted · 2021-02-06 20:16:43
