Problems with curve_fit fitting highly correlated data

For my bachelor thesis, I am working on a project where I want to perform a fit to some data. The full problem is more complex, but I have reduced it to a minimal example here:

We have three data points (very little theory data is available), but these points are highly correlated.

Using curve_fit to fit these points, we get a horrible fit result, as you can see in this picture. (The fit could easily be improved by altering the fit parameters by hand.)

Our fit results with correlations (blue) and with neglected correlations (orange):

[plot]

The results get better when we use more parameters (since by then the fit essentially amounts to solving an exact system).

My question: why does this behaviour happen? (We use our own least-squares algorithm for our specific problem, but it suffers from the same issue.) Is it a numerical problem, or is there a good reason for curve_fit to produce this solution?

I would be very happy to have a good explanation of why we can't use "only" 2 parameters to fit these 3 highly correlated data points.

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
plt.rcParams['lines.linewidth'] = 1

y = np.array([1.1994, 1.0941, 1.0047])
w = np.array([1, 1.08, 1.16])
cor = np.array([[1, 0.9674, 0.8812],
                [0.9674, 1, 0.9523],
                [0.8812, 0.9523, 1]])
s = np.array([0.0095, 0.0104, 0.0072])

def f(x, a, b):
    return a + b*x

# build the covariance matrix from the correlation matrix and the standard deviations
cov = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        cov[i, j] = cor[i, j] * s[i] * s[j]

# curve_fit returns (popt, pcov); keep only the fitted parameters
p1, _ = curve_fit(f, w, y, sigma=cov)  # fit using the full covariance matrix
p2, _ = curve_fit(f, w, y)             # fit ignoring the correlations

plt.plot(w, f(w, *p1))
plt.plot(w, f(w, *p2))

plt.scatter(w, y)
plt.show()

It is not a numerical problem. The "problem" is that the off-diagonal terms of your covariance matrix are all positive and relatively large. These determine the correlations among the errors in the fit, so if all the terms are positive, you are saying that all the errors are positively correlated. If one is large, the others will tend to also be large with the same sign.
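One way to make this concrete (a minimal sketch using the data from the question; the variable names are ours): with a 2-D `sigma`, curve_fit minimizes the generalized chi-square rᵀΣ⁻¹r rather than the plain sum of squared residuals, and the visually "worse" line is in fact the better fit under that metric.

```python
import numpy as np
from scipy.optimize import curve_fit

y = np.array([1.1994, 1.0941, 1.0047])
w = np.array([1, 1.08, 1.16])
cor = np.array([[1, 0.9674, 0.8812],
                [0.9674, 1, 0.9523],
                [0.8812, 0.9523, 1]])
s = np.array([0.0095, 0.0104, 0.0072])
cov = cor * np.outer(s, s)  # covariance from correlation matrix and std devs

def f(x, a, b):
    return a + b*x

p_corr, _ = curve_fit(f, w, y, sigma=cov)  # uses the full covariance
p_plain, _ = curve_fit(f, w, y)            # ignores the correlations

def gen_chi2(p):
    """Generalized chi-square r^T cov^{-1} r for parameters p."""
    r = f(w, *p) - y
    return r @ np.linalg.solve(cov, r)

# The correlated fit wins under the objective it actually minimizes,
# even though it looks worse to the eye.
print(gen_chi2(p_corr) <= gen_chi2(p_plain))  # True
```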

Here's an example similar to yours, with the covariance matrix

        [2.0  1.3  0.0]
sigma = [1.3  2.0  1.3]
        [0.0  1.3  2.0]

(The condition number of this matrix is 23.76, so we shouldn't expect any numerical problems.)
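This is easy to check directly:

```python
import numpy as np

sig = np.array([[2.0, 1.3, 0.0],
                [1.3, 2.0, 1.3],
                [0.0, 1.3, 2.0]])

# condition number = ratio of largest to smallest singular value
print(np.linalg.cond(sig))  # ~23.76
```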

While the covariance between the first and third points is 0, it is 1.3 between the first and second and between the second and third, and 1.3 is a relatively large fraction of the variances, which are all 2.0. So it will not be surprising if all the errors in the fitted model have the same sign.

This script fits the three points and plots the data and the fitted line.

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt


def f(x, a, b):
    return a + b*x


x = np.array([1, 2, 3])
y = np.array([2, 0.75, 0])
sig = np.array([[2.0, 1.3, 0.0],
                [1.3, 2.0, 1.3],
                [0.0, 1.3, 2.0]])

params, pcov = curve_fit(f, x, y, sigma=sig)

y_errors = f(x, *params) - y

plt.plot(x, y, 'ko', label="data")
plt.plot(x, f(x, *params), linewidth=2.5, label="fitted curve")
plt.vlines(x, y, f(x, *params), 'r')

for k in range(3):
    # pass the annotation text positionally (the old s= keyword was removed
    # in recent Matplotlib versions)
    plt.annotate(r"$e_{%d}$" % (k+1), xy=(x[k]-0.05, y[k]+0.5*y_errors[k]), ha='right')

plt.xlabel('x')
plt.ylabel('y')
plt.axis('equal')
plt.grid()
plt.legend(framealpha=1, shadow=True)
plt.show()

[plot: data points, fitted line, and the errors e1, e2, e3, all on the same side of the line]

As you can see in the plot, all the errors have the same sign.
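We can also confirm this numerically instead of reading it off the plot (a small self-contained check repeating the fit from the script above):

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):
    return a + b*x

x = np.array([1, 2, 3])
y = np.array([2, 0.75, 0])
sig = np.array([[2.0, 1.3, 0.0],
                [1.3, 2.0, 1.3],
                [0.0, 1.3, 2.0]])

params, _ = curve_fit(f, x, y, sigma=sig)
errors = f(x, *params) - y

# All three residuals come out positive: the line passes above every point.
print(np.sign(errors))
```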

We can confirm this reasoning by considering another covariance matrix,

        [ 2.0   1.3  -1.0]
sigma = [ 1.3   2.0  -1.3]
        [-1.0  -1.3   2.0]

In this case, all the off-diagonal terms are relatively large in magnitude. The covariance between the first and second errors is positive, and it is negative between the second and third and between the first and third. If these off-diagonal terms are large enough relative to the variances, we should expect the signs of the errors of the first two points to be the same, while the third error will have the opposite sign of the first two.
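A quick numerical check of that prediction (the same script with only sig swapped out):

```python
import numpy as np
from scipy.optimize import curve_fit

def f(x, a, b):
    return a + b*x

x = np.array([1, 2, 3])
y = np.array([2, 0.75, 0])
sig = np.array([[ 2.0,  1.3, -1.0],
                [ 1.3,  2.0, -1.3],
                [-1.0, -1.3,  2.0]])

params, _ = curve_fit(f, x, y, sigma=sig)
errors = f(x, *params) - y

# The first two errors share a sign and the third has the opposite sign.
print(np.sign(errors))
```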

Here's the plot generated by the script when sig is changed to the above matrix:

[plot: data, fitted line, and errors; e1 and e2 lie on one side of the line, e3 on the other]

The errors show the expected pattern.
