
Python fsolve shape of fprime is wrong?

The following could not be much simpler, yet I cannot find how to make it work...

from scipy.optimize import fsolve

def a(x):
    return x*x-x

def ap(x):
    return 2*x-1

#works
print(fsolve(a, 0.3))

#works
print(fsolve(a, 0.3, fprime=ap))

#works
print(fsolve(a, [0.3], fprime=ap))

#works
print(fsolve(a, [0.3, 0.7]))

#crashes
print(fsolve(a, [0.3, 0.7], fprime=ap))

When it crashes, it gives the error

TypeError: fsolve: there is a mismatch between the input and output shape of the 'fprime' argument 'ap'.Shape should be (2, 2) but it is (2,).

The output dimension of ap seems like it should definitely be the same as the input. How could this possibly be going wrong (and how do I fix it)?

I think some downvoters are missing the subtlety of the question, so here is a more in-depth explanation of why I am confused:

It seems that scipy interprets a as a function of one variable, with [0.3, 0.7] as starting estimates of two roots, in fsolve(a, [0.3, 0.7]), but in fsolve(a, [0.3, 0.7], fprime=ap) it interprets a as a function of two variables, with [0.3, 0.7] being the estimate of a single root. According to the documentation ( https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fsolve.html ), the description of the second argument

x0 : ndarray
    The starting estimate for the roots of func(x) = 0.

says that it is looking for an estimate of the roots (plural). But its behavior in the case where fprime is given makes it sound like x0 is interpreted as an estimate of a single root.

I suspect you expected fsolve(a, [0.3, 0.7], fprime=ap) to return two solutions to the scalar problem. fsolve does not work that way.

When you call fsolve(a, x0, fprime=ap), the fsolve function infers the dimension of the problem from the shape of x0. If x0 is a scalar, it expects a to accept a scalar, and fprime must accept a scalar and return a scalar (or a 1x1 array). If x0 is a sequence of length 2 (as in your example that didn't work), fsolve expects a to accept an array of length 2 as its x argument and return a sequence of length 2. It also expects fprime to accept an array of length 2 and return an array (the Jacobian matrix) with shape 2x2.
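To make the shape rule above concrete, here is a minimal sketch with a genuinely two-dimensional problem. The system f and its Jacobian jac are hypothetical examples chosen for illustration; they are not from the question. The point is only that for an x0 of length 2, fprime returns a 2x2 array.

```python
import numpy as np
from scipy.optimize import fsolve

def f(x):
    # Hypothetical system of two equations in two unknowns:
    #   x[0]**2 + x[1]**2 - 4 = 0
    #   x[0] - x[1]           = 0
    return np.array([x[0]**2 + x[1]**2 - 4.0, x[0] - x[1]])

def jac(x):
    # Jacobian matrix: row i holds the partial derivatives of equation i
    # with respect to x[0] and x[1] (fsolve's default, col_deriv=0).
    return np.array([[2.0 * x[0], 2.0 * x[1]],
                     [1.0, -1.0]])

x0 = np.array([1.0, 1.0])
print(jac(x0).shape)              # (2, 2), because len(x0) == 2
root = fsolve(f, x0, fprime=jac)  # a single root, itself a length-2 vector
print(root)
```

From this starting point the solver should land near (sqrt(2), sqrt(2)): one root of a two-variable system, not two roots of a one-variable equation.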

Thanks to everyone who explained why I was getting the behavior I was getting. No one actually told me how to fix the issue though, so I'm writing this answer now that I have figured it out.

The confusion stems from the fact that the scipy documentation is somewhere between wrong and misleading.

The documentation says:

Return the roots of the (non-linear) equations defined by func(x) = 0 given a starting estimate.

suggesting that for a scalar function it may return an array of roots. The actual behavior is closer to

Returns a root of the (non-linear) equation defined by func(x) = 0 given a starting estimate.

This wording would have made it obvious what to do. scipy always interprets a(x) as a function of n variables, where n is the length of x. So to find multiple roots of the scalar equation in one call, let a be vectorized, i.e.

a([x1, x2, ..., xn]) = [a(x1), a(x2), ..., a(xn)]

The Jacobian of this function is a diagonal matrix, because the i-th output depends only on the i-th input:

diag([ap(x1), ap(x2), ..., ap(xn)])
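As a sanity check on the diagonal-Jacobian claim, here is a small sketch that compares np.diag(ap(x)) with a central finite-difference Jacobian of the vectorized a. The helper numerical_jacobian is a hypothetical utility written for this check; it is not part of scipy or numpy.

```python
import numpy as np

def a(x):
    return x * x - x

def ap(x):
    return 2 * x - 1

def numerical_jacobian(func, x, h=1e-6):
    # Hypothetical helper: central finite differences.
    # Column j approximates the derivative of every output with respect to x[j].
    x = np.asarray(x, dtype=float)
    n = x.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = h
        J[:, j] = (func(x + step) - func(x - step)) / (2 * h)
    return J

x = np.array([0.3, 0.7])
J_numeric = numerical_jacobian(a, x)
J_analytic = np.diag(ap(x))
# The two matrices should agree, with zeros off the diagonal.
print(np.allclose(J_numeric, J_analytic, atol=1e-6))
```

The off-diagonal entries vanish because changing x[j] leaves a(x[i]) untouched for i != j.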

Thus the answer to my question is:

Change

print(fsolve(a, [0.3, 0.7], fprime=ap))

to

import numpy as np  # needed for np.diag

print(fsolve(a, [0.3, 0.7], fprime=lambda x: np.diag(ap(x))))
# correctly outputs
# [ -3.69321143e-16   1.00000000e+00]

or simply loop over [0.3, 0.7], calling fsolve once per initial guess.
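The loop alternative might look like the sketch below. It reuses a and ap from the question; the names guesses and roots are just illustrative. Each call is a one-dimensional problem, so the scalar derivative ap can be passed as fprime unchanged.

```python
from scipy.optimize import fsolve

def a(x):
    return x * x - x

def ap(x):
    return 2 * x - 1

guesses = [0.3, 0.7]
# One independent scalar solve per initial guess.
# fsolve returns a length-1 array for a scalar x0, hence the [0].
roots = [fsolve(a, g, fprime=ap)[0] for g in guesses]
print(roots)
```

This keeps the solves decoupled, so a guess that fails to converge does not affect the others, at the cost of one fsolve call per guess.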
