Curve fitting for 3-dimensional data in Python

I have X and Y data with dimensions (7, 360, 720) (global grid cells at 0.5-degree resolution) as input. I want to fit a sigmoid curve to each grid cell with the code below and obtain the curve parameters on the same grid as X and Y:

# -*- coding: utf-8 -*-
import numpy as np
from scipy.optimize import curve_fit

# x and y are the (7, 360, 720) input arrays, loaded elsewhere

def sigmoid(x, a, b, c):
    return a + b * (1 - np.exp(-c * x**2))

with open('test.csv', 'w') as f:
    for i in range(360):
        for j in range(720):
            # the 7 values of this grid cell, with the point (0, 0) prepended
            xdata = [0, x[0,i,j], x[1,i,j], x[2,i,j], x[3,i,j], x[4,i,j], x[5,i,j], x[6,i,j]]
            ydata = [0, y[0,i,j], y[1,i,j], y[2,i,j], y[3,i,j], y[4,i,j], y[5,i,j], y[6,i,j]]
            popt, pcov = curve_fit(sigmoid, xdata, ydata)
            print(popt)
            f.write(','.join(map(str, popt)))
            f.write("\n")

This code writes the fitting results to a .csv file with 3 columns (a, b, c), but I want to write and store the results in a file with shape (360, 720), matching the grid cells. The code also raises the following error: RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.

The dimensionality of your data (referenced in the title of the question) is not the cause of the problem you are seeing. What you are trying to do is run 360*720 (~260,000) separate fits to your sigmoidal function, with inputs derived from your arrays x and y. It should work, but it might be slow simply because you are doing so many fits.

If you haven't already, you should definitely start by fitting a couple of arrays to your function. If you can't get 3 to work, there's no point in trying 260,000, right? So, start with 1, then try 3, then 360, then all of them.

I suspect the problem you are seeing is because curve_fit() stupidly allows you to not explicitly specify starting values for your parameters, and even more stupidly assigns unspecified starting values the arbitrary value of 1. This encourages new users not to think carefully about the problem they are trying to solve, and then gives cryptic error messages like the one you are seeing, which do not explicitly say "you need better starting values". The message says the fit used up its allowed number of function evaluations, which probably means the fit "got lost" trying to find optimal values. That "getting lost" is probably because it started "too far from home".

In general, curve fitting is sensitive to the starting values of the parameters. And you probably do know better starting values than a=1, b=1, c=1. I suspect that you also know that exponentiation can get huge or tiny very quickly. Because of this, and depending on the scale of your x, there are probably ranges of values for c that are not really sensible. It might be that c should be positive and smaller than 10, for example. Again, you probably sort of know these ranges.
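If you want to stay with curve_fit, explicit starting values and bounds can be passed through its p0 and bounds arguments. The following is only a minimal sketch: the data are synthetic stand-ins for a single grid cell, and the starting values and the upper limit of 5 on c are assumptions for illustration, not values derived from your data.

```python
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, a, b, c):
    return a + b * (1 - np.exp(-c * x**2))

# Synthetic stand-in for one grid cell (hypothetical values, not real data)
rng = np.random.default_rng(0)
xdata = np.linspace(0.0, 3.0, 8)
true_params = (0.5, 2.0, 0.8)
ydata = sigmoid(xdata, *true_params) + rng.normal(0.0, 0.01, xdata.size)

# Explicit starting values instead of the default a = b = c = 1
p0 = [0.0, 1.5, 0.5]

# Assumed plausible ranges: a unbounded, b >= 0, 0 <= c <= 5
lower = [-np.inf, 0.0, 0.0]
upper = [np.inf, np.inf, 5.0]

popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=p0, bounds=(lower, upper))
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainty estimates
print("best-fit a, b, c:", popt)
print("uncertainties:   ", perr)
```

Reasonable values for p0 and the bounds depend on the scale of your own x and y, so treat the numbers above as placeholders.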

Let me suggest using lmfit (https://lmfit.github.io/lmfit-py/) for this work. It provides an alternative approach to curve fitting, with many useful improvements over curve_fit. For your problem, a single fit might look like this:

import numpy as np
from lmfit import Model

def sigmoid(x, offset, scale, decay):
    return offset + scale*(1 - np.exp(-decay*(x**2)))

## set up the model and parameters from your model function
# note that parameters will be *named* using the names of the
# arguments to your model function.
model = Model(sigmoid)
# make parameters (OrderedDict-like) with initial values
params = model.make_params(offset=0, scale=1, decay=0.25) 

# you may want to set bounds on some of the parameters
params['scale'].min = 0
params['decay'].min = 0
params['decay'].max = 5

# you can also fix some parameters if desired
# params['offset'].vary = False 

## set up data
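# pick an arbitrary grid cell to fit, and make sure data use np arrays
# (x and y are your (7, 360, 720) arrays).
# but also: (0, 0) isn't in your data -- do you need to assert it?
# won't that drive `offset` to 0?
i, j = 7, 12
# prepend the point (0, 0) to the 7 values for this cell
xdata = np.concatenate(([0], x[:, i, j]))
ydata = np.concatenate(([0], y[:, i, j]))

# now fit model to data, get results
# (Model.fit takes the data first, then the parameters)
result = model.fit(ydata, params, x=xdata)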

print(result.fit_report())

This will print out a report with fit statistics, best-fit parameter values, and uncertainties. You can read the docs for all the components of result, but result.params holds the best-fit parameters and their uncertainties.

For use in a loop, this approach has the convenient feature that result is unique for each data set, while the starting params are not altered by the fit and can be re-used as starting values for all your fits. A test loop might look like this:

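results = []
for i in (50, 150, 250):
    for j in (200, 400, 600):
        # prepend the point (0, 0) to the 7 values for this cell
        xdata = np.concatenate(([0], x[:, i, j]))
        ydata = np.concatenate(([0], y[:, i, j]))

        # the same starting params are re-used for every cell
        result = model.fit(ydata, params, x=xdata)
        results.append([i, j, result.params, result.chisqr])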

It is still possible that some of the 260,000 fits will not succeed, but I think that lmfit will give you better tools to avoid and identify these cases.
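As for storing the results on the (360, 720) grid, one option is to keep one 2-D array per parameter and leave NaN wherever a fit fails. The sketch below is a rough illustration under several assumptions: it uses scipy's curve_fit rather than lmfit so that it depends only on NumPy and SciPy, it runs on a tiny synthetic grid instead of the real (7, 360, 720) arrays, it does not prepend the point (0, 0), and the file names and temporary output directory are made up for the example.

```python
import tempfile
from pathlib import Path

import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, a, b, c):
    return a + b * (1 - np.exp(-c * x**2))

# Tiny synthetic stand-in for the (7, 360, 720) arrays (hypothetical data)
rng = np.random.default_rng(1)
nt, ny, nx = 7, 4, 5
x = np.sort(rng.uniform(0.2, 3.0, size=(nt, ny, nx)), axis=0)
y = sigmoid(x, 0.0, 2.0, 0.8) + rng.normal(0.0, 0.01, size=x.shape)
y[:, 0, 0] = np.nan  # pretend one cell has missing data

# One 2-D map per parameter; NaN marks cells without a usable fit
fitted = np.full((3, ny, nx), np.nan)
p0 = [0.0, 1.0, 0.5]                                   # assumed starting values
bounds = ([-np.inf, 0.0, 0.0], [np.inf, np.inf, 5.0])  # assumed ranges

for i in range(ny):
    for j in range(nx):
        xdata, ydata = x[:, i, j], y[:, i, j]
        # skip cells with missing values instead of letting the fit raise
        if not (np.all(np.isfinite(xdata)) and np.all(np.isfinite(ydata))):
            continue
        try:
            popt, _ = curve_fit(sigmoid, xdata, ydata, p0=p0, bounds=bounds)
        except RuntimeError:
            continue  # fit did not converge; the cell stays NaN
        fitted[:, i, j] = popt

# Write each parameter map as its own grid-shaped CSV file
out_dir = Path(tempfile.mkdtemp())
for name, grid in zip("abc", fitted):
    np.savetxt(out_dir / f"fit_{name}.csv", grid, delimiter=",")
print("wrote parameter maps to", out_dir)
```

Each output file then has one row per grid row and one column per grid column, so the real data would give three files of shape (360, 720). The same bookkeeping should carry over to the lmfit loop by storing result.params['offset'].value and the other parameter values instead of popt.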