Python - Minimizing chi-squared

Question

I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately, the code below does not correctly minimize the chisqfunc function. It finds the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions, which worked correctly. Could you please suggest how to fix the code below, or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

def chisqfunc(params):
    # Unpack inside the body; tuple parameters in the signature are Python 2 only.
    a, b = params
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print(result)

Thank you for reading my question and any help would be greatly appreciated.

Cheers, Will

EDIT: Data set I am currently using: Link to data

Answer 1

The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc(), like print((a, b)), and rerun your code, you'll get something like:

(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)

This means that minimize evaluates the function only at these points.

If you now try to evaluate chisqfunc() at these 3 pairs of values, you'll see that the results match EXACTLY, for example:

print(chisqfunc((0, 0)) == chisqfunc((1.4901161193847656e-08, 0)))
True

This happens because of floating-point rounding. In other words, when evaluating stress - model, the variable stress is too many orders of magnitude larger than model, so the tiny change in model is rounded away and the result is unchanged.
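The rounding effect can be reproduced without the data file. The sketch below is only an illustration: the stress magnitude of 1e9 is an assumed value, not taken from your data set, and step is the perturbation seen in the printed output above.

import sys

# Hypothetical numbers chosen for illustration; they are not taken from data.csv.
step = 1.4901161193847656e-08   # perturbation seen in the printed output above
large_stress = 1.0e9            # assumed stress magnitude (e.g. in pascals)
scaled_stress = 1.0             # the same quantity after rescaling to order one

# Adjacent doubles near 1e9 are about 1.2e-7 apart, so the step is rounded away.
lost = (large_stress - step) == large_stress
# Near 1.0 the spacing is about 1e-16, so the same step is resolved.
kept = (scaled_stress - step) != scaled_stress

print(sys.float_info.epsilon)   # relative spacing of doubles, about 2.2e-16
print(lost, kept)

Both flags come out True under these assumptions, which is why the three function evaluations are identical for large stress values but would differ once the data are of order one.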

One could then just try brute-forcing it by increasing the floating-point precision, writing data = data.astype(numpy.float128) just after loading the data with loadtxt. minimize then fails, with result.success=False, but with a helpful message:

Desired error not necessarily achieved due to precision loss.

One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude as stress. The other is to rescale the data, so that the solution will be closer to your initial guess (0,0).
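For the first option, one way to get a guess of the right order of magnitude is a weighted straight-line fit with numpy.polyfit, which with weights 1/err_stress minimizes the same chi-squared sum. This is only a sketch: the arrays below are made-up, noise-free data, and the values 5e6 and 2e11 are assumptions rather than anything from your data set.

import numpy

# Synthetic, noise-free data standing in for data.csv (assumed values).
strain = numpy.linspace(0.0, 0.002, 20)
true_a, true_b = 5.0e6, 2.0e11          # hypothetical intercept and slope
stress = true_a + true_b * strain
err_stress = numpy.full_like(stress, 1.0e6)

# polyfit with w = 1/sigma minimizes sum(((stress - model) / err_stress)**2).
# Coefficients come back highest degree first: slope, then intercept.
b0, a0 = numpy.polyfit(strain, stress, 1, w=1.0 / err_stress)

# Initial guess in the (a, b) order that chisqfunc expects.
x0 = numpy.array([a0, b0])
print(x0)

The resulting x0 can then be passed to opt.minimize(chisqfunc, x0). For a straight line the polyfit result is already the chi-squared minimum, so the minimizer should have little left to do.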

It is MUCH better if you just rescale the data, for example making it nondimensional with respect to a certain stress value (like the yielding/cracking stress of this material).

This is an example of the fitting, using the maximum measured stress as the stress scale. There are very few changes from your code:

import numpy
import scipy.optimize as opt
import matplotlib.pyplot as plt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

# Rescale by the maximum measured stress so the data are of order one.
smax = stress.max()
stress = stress/smax
# I am assuming the errors err_stress are in the same units as stress.
err_stress = err_stress/smax

def chisqfunc(params):
    a, b = params
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result = opt.minimize(chisqfunc, x0)
print(result)
assert result.success
# Convert the fitted parameters back to the original stress units.
a, b = result.x*smax
plt.plot(strain, stress*smax)
plt.plot(strain, a + b*strain)
plt.show()

Your linear model is quite good, i.e. your material has a very linear behaviour for this range of deformation (what material is it anyway?).

Answer 1: score 3, accepted, 2014-03-05 16:17:33
