
Python - Minimizing Chi-squared

I have been trying to fit a linear model to a set of stress/strain data by minimizing chi-squared. Unfortunately, the code below does not correctly minimize the chisqfunc function: it reports the minimum at the initial conditions, x0, which is not correct. I have looked through the scipy.optimize documentation and tested minimizing other functions, which worked correctly. Could you please suggest how to fix the code below, or suggest another method I can use to fit a linear model to data by minimizing chi-squared?

import numpy
import scipy.optimize as opt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

def chisqfunc(params):
    # Chi-squared of the linear model a + b*strain against the measured stress
    a, b = params
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print(result)

Thank you for reading my question and any help would be greatly appreciated.

Cheers, Will

EDIT: Data set I am currently using: Link to data

The problem is that your initial guess is very far from the actual solution. If you add a print statement inside chisqfunc(), like print((a, b)), and rerun your code, you'll get something like:

(0, 0)
(1.4901161193847656e-08, 0.0)
(0.0, 1.4901161193847656e-08)

This means that minimize evaluates the function only at these points: the starting point, plus one tiny step (about 1.49e-08) along each parameter, which it uses to estimate the gradient by finite differences.

If you now evaluate chisqfunc() at these 3 pairs of values, you'll see that the results match EXACTLY, for example:

print(chisqfunc((0,0))==chisqfunc((1.4901161193847656e-08,0)))
True

This happens because of floating-point rounding. In other words, when evaluating stress - model, stress is too many orders of magnitude larger than model, so the contribution of model is lost to rounding. The estimated gradient is therefore exactly zero, and minimize stops at x0 as if it were a minimum.
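The rounding effect can be reproduced without the original data file. The following is a minimal sketch on synthetic data; the stress magnitudes (of order 1e9), the strain range, and the 1% errors are assumptions chosen for illustration, not values taken from the asker's data set.

```python
import numpy as np

# Synthetic data (assumed values): stresses of order 1e9, strains of order 1e-2.
strain = np.linspace(0.005, 0.010, 6)
stress = 2.0e11 * strain        # roughly 1.0e9 ... 2.0e9
err_stress = 0.01 * stress      # assumed 1% uncertainty

def chisqfunc(params):
    a, b = params
    model = a + b * strain
    return np.sum(((stress - model) / err_stress) ** 2)

# Step size seen in the printed evaluations: sqrt(machine epsilon).
eps = np.sqrt(np.finfo(float).eps)

# Subtracting eps from numbers this large changes nothing in double precision.
print(np.array_equal(stress - eps, stress))   # True

f0 = chisqfunc((0.0, 0.0))
fa = chisqfunc((eps, 0.0))
fb = chisqfunc((0.0, eps))
print(f0 == fa == fb)                         # True

# Forward-difference gradient estimate built from those three evaluations.
grad = np.array([(fa - f0) / eps, (fb - f0) / eps])
print(grad)
```

Since the three function values are identical, the finite-difference gradient comes out as exactly zero, which is why the optimizer sees no direction in which to move.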

One could then try brute-forcing it by increasing the floating-point precision, writing data=data.astype(numpy.float128) just after loading the data with loadtxt (float128 is not available on every platform). Even then minimize fails, with result.success=False, but with a helpful message:

Desired error not necessarily achieved due to precision loss.

One possibility is then to provide a better initial guess, so that in the subtraction stress - model the model part is of the same order of magnitude as stress. The other is to rescale the data, so that the solution will be closer to your initial guess (0,0).

It is MUCH better to just rescale the data, for example making it nondimensional with respect to a certain stress value (like the yielding/cracking stress of this material).

This is an example of the fit, using the maximum measured stress as the stress scale. There are very few changes from your code:

import numpy
import scipy.optimize as opt
import matplotlib.pyplot as plt

filename = 'data.csv'

data = numpy.loadtxt(open(filename,"r"),delimiter=",")

stress = data[:,0]
strain = data[:,1]
err_stress = data[:,2]

# Rescale by the maximum measured stress so the data are of order 1
smax = stress.max()
stress = stress/smax
# I am assuming the errors err_stress are in the same units as stress.
err_stress = err_stress/smax

def chisqfunc(params):
    a, b = params
    model = a + b*strain
    chisq = numpy.sum(((stress - model)/err_stress)**2)
    return chisq

x0 = numpy.array([0,0])

result =  opt.minimize(chisqfunc, x0)
print(result)
assert result.success
# Convert the fitted parameters back to the original stress units
a,b=result.x*smax
plt.plot(strain,stress*smax)
plt.plot(strain,a+b*strain)
plt.show()

Your linear model is quite good, i.e. your material has a very linear behaviour for this range of deformation (what material is it anyway?).
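As for another method: because the model a + b*strain is linear in a and b, the chi-squared minimum can also be computed directly by weighted linear least squares, with no iterative optimizer and no initial guess. The following is a minimal sketch of that alternative, not the approach used in the code above; the data are synthetic, and a_true, b_true, and the error size are assumed values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed values, not the asker's data set).
a_true, b_true = 5.0e7, 2.0e11
strain = np.linspace(0.001, 0.010, 20)
err_stress = np.full_like(strain, 1.0e7)
stress = a_true + b_true * strain + rng.normal(0.0, err_stress)

# Divide each row of the design matrix and each observation by its error,
# so that ordinary least squares on the weighted system minimizes chi-squared.
A = np.column_stack([np.ones_like(strain), strain]) / err_stress[:, None]
y = stress / err_stress

(a_fit, b_fit), *_ = np.linalg.lstsq(A, y, rcond=None)

chisq = np.sum(((stress - (a_fit + b_fit * strain)) / err_stress) ** 2)
print(a_fit, b_fit, chisq)
```

Dividing by err_stress plays the same role as the rescaling above: the weighted system contains numbers of moderate size, so the fit does not depend on the raw stress units.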


 