SciPy + Numpy: Finding the slope of a sigmoid curve

I have some data that follow a sigmoid distribution, as you can see in the following image:

[image: sigmoid data for the year 2003]

After normalizing and scaling my data, I fitted the curve at the bottom using scipy.optimize.curve_fit and some initial parameters:

popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0 = [0.05, 0.05, 0.05])
>>> print popt
[  2.82019932e+02  -1.90996563e-01   5.00000000e-02]

So popt, according to the documentation, returns "Optimal values for the parameters so that the sum of the squared error of f(xdata, popt) - ydata is minimized". From this I understand that curve_fit does not calculate the slope, because I do not think the slope of this gentle curve is 282, nor is it negative.
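For reference, this is all those optimal values are: plugging popt back into the model reproduces the fitted curve, and the sum of squared errors is the quantity curve_fit minimizes. A minimal sketch, assuming xdata, ydata and the three-parameter sigmoid_function shown in EDIT #1 below are already defined:

import numpy as np

# Evaluate the fitted model; *popt fills in the model's free parameters (x0, k, p0).
y_fit = sigmoid_function(xdata, *popt)
sse = np.sum((y_fit - ydata) ** 2)   # the sum of squared errors that curve_fit minimized
print(sse)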

Then I tried scipy.optimize.leastsq, because the documentation says it returns "The solution (or the result of the last iteration for an unsuccessful call).", so I thought the slope would be returned. Like this:

p, cov, infodict, mesg, ier = leastsq(residuals, p_guess, args = (nxdata, nydata), full_output=True)
>>> print p
Param(x0=281.73193626250207, y0=-0.012731420027056234, c=1.0069006606656596, k=0.18836680131910222)

But again, I did not get what I expected. curve_fit and leastsq returned almost the same values, which is not surprising I guess, as curve_fit uses an implementation of the least squares method internally to find the curve. But no slope came back... unless I overlooked something.

So, how can I calculate the slope at a point, say, where X = 285 and Y = 0.5?

I am trying to avoid manual methods, like calculating the derivative from, say, (285.5, 0.55) and (284.5, 0.45), subtracting and dividing the results, and so on. I would like to know if there is a more automatic method for this.
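For concreteness, that manual route could at least be scripted instead of done by hand. Here is a rough numerical sketch on the fitted curve (it assumes the sigmoid_function defined in EDIT #1 below and the popt from the curve_fit call above; numerical_slope is just an illustrative name), which is exactly the kind of thing I would like to avoid:

def numerical_slope(x, params, h=0.5):
    # Central difference on the fitted curve around x; h is the half-step width.
    return (sigmoid_function(x + h, *params) - sigmoid_function(x - h, *params)) / (2.0 * h)

print(numerical_slope(285.0, popt))   # approximate slope near x = 285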

Thank you all!

EDIT #1

This is my sigmoid_function, used by both the curve_fit and leastsq methods:

def sigmoid_function(xdata, x0, k, p0): # p0 not used anymore, only its components (x0, k)
    # This function is called by two different methods, curve_fit and leastsq
    # (the latter through the residuals function). I don't know if it makes sense
    # to use a single function for two (somewhat similar) methods, but there
    # it goes.

    # p0:
    #   + The initial parameter guess for scipy.optimize.curve_fit.
    #   + For the residuals calculation it is left empty.
    #   + It is initialized to [0.05, 0.05, 0.05].
    # x0:
    #   + The convergence parameter on the X-axis and also the shift.
    #   + It starts at 0.05 and ends up being around ~282 (days in a year).
    # k:
    #   + Set up either by curve_fit or leastsq.
    #   + In leastsq it is initially fixed at 0.5 and in curve_fit to 0.05.
    #   + Why? I just tried this approach in two different ways and it seems
    #   + to be working.
    #   + But honestly, I have no clue what it represents.
    # xdata:
    #   + Positions on the X-axis. In this case from 240 to 365.

    # Finally I changed these parameters as suggested in the answer.
    # The sigmoid curve has 2 degrees of freedom, therefore the initial
    # guess only needs to be this size. In this case, p0 = [282, 0.5]


    y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
    return y

def residuals(p_guess, xdata, ydata):
    # For the residuals calculation there is no need to set up the initial parameters.
    # After fixing the initial guess and the sigmoid_function header, the trailing [] can be removed:
    # return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1])
    return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1], [])

I am sorry if I made mistakes while describing the parameters or confused any technical terms. I am very new to numpy and I have not studied maths for years, so I am catching up again.

So, again, what is your advice for calculating the slope at X = 285, Y = 0.5 (more or less the midpoint) for this dataset? Thanks!!

EDIT #2

Thanks to Oliver W., I updated my code as he suggested and understood the problem a bit better.

There is a final detail I do not fully get. Apparently, curve_fit returns a popt array (x0, k) with the optimum parameters for the fit:

  • x0 seems to indicate how shifted the curve is, by giving the central point of the curve
  • k seems to be the slope when y = 0.5, i.e. at the center of the curve (I think!)

Why, if the sigmoid function is an increasing one, is the derivative/slope in popt negative? Does that make sense?

I used sigmoid_derivative to calculate the slope and, yes, I obtained the same results as popt but with a positive sign.

# Year 2003, 2005, 2007. Slope in midpoint.
k = [-0.1910, -0.2545, -0.2259] # Values coming from popt
slope = [0.1910, 0.2545, 0.2259] # Values coming from sigmoid_derivative function

I know this is being a bit picky, because I could use either. The relevant data is in there, just with a negative sign, but I was wondering why this happens.

So, the calculation of the derivative function as you suggested is only required if I need to know the slope at points other than y = 0.5. For the midpoint alone, I can use popt.
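In code, reading the midpoint slope straight from popt looks like this (a small sketch based on the two-parameter fit from the answer below; the sign convention is the one discussed above):

x0_fit, k_fit = popt        # two-parameter fit: (x0, k)
midpoint_slope = -k_fit     # at x == x0 (y == 0.5) the slope is -k, per the discussion above
print(midpoint_slope)       # about 0.1910 for the 2003 data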

Thanks for your help, it saved me a lot of time. :-)

You're never using the parameter p0 that you pass to your sigmoid function. Hence, the curve fitting has no good measure to find convergence, because it can take any value for this parameter. You should first rewrite your sigmoid function like this:

def sigmoid_function(xdata, x0, k):

    y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
    return y

This means your model (the sigmoid) has only two degrees of freedom. This will be returned in popt:

initial_guess = [282, 1]  # (x0, k): at x0, the sigmoid reaches 50%, k is slope related
popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0=initial_guess)

Now popt will be a tuple (or array of 2 values), containing the best possible x0 and k.
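For example, a quick sanity check of the fit (variable names here are just illustrative):

x0_fit, k_fit = popt
# At x == x0 the exponent is zero, so the model returns exp(0) / (1 + exp(0)) == 0.5
print(sigmoid_function(x0_fit, x0_fit, k_fit))   # -> 0.5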

To get the slope of this function at any point, to be honest, I would just calculate the derivative symbolically, as the sigmoid is not such a hard function. You will end up with:

def sigmoid_derivative(x, x0, k):
    f = np.exp(-k*(x-x0))
    return -k / f

If you have the results from your curve fitting stored in popt, you can easily pass them to this function:

print(sigmoid_derivative(285, *popt))

which will return the derivative at x=285 for you. But because you asked specifically about the midpoint, that is, where x==x0 and y==.5, you'll see (from sigmoid_derivative) that the derivative there is just -k, which can be read directly from the curve_fit output you've already obtained. In the output you've shown, that's about 0.19.
