
How to get dimensions right using fmin_cg in scipy.optimize

I have been trying to use fmin_cg to minimize the cost function for logistic regression.

This is how I call fmin_cg:

xopt = fmin_cg(costFn, fprime=grad, x0=initial_theta,
               args=(X, y, m), maxiter=400, disp=True, full_output=True)

Here is my costFn:

def costFn(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 0
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J.flatten()

Here is my grad:

def grad(theta, X, y, m):
    h = sigmoid(X.dot(theta))
    J = 1 / m * np.sum((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg = 1 / m * (X.T.dot(h-y))
    return gg.flatten()
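
For reference, sigmoid isn't shown above; it's just the usual logistic function, something like:

def sigmoid(z):
    # standard logistic function, squashes any real value into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))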

It seems to be throwing this error:

/Users/sugethakch/miniconda2/lib/python2.7/site-packages/scipy/optimize/linesearch.pyc in phi(s)
     85     def phi(s):
     86         fc[0] += 1
---> 87         return f(xk + s*pk, *args)
     88 
     89     def derphi(s):

ValueError: operands could not be broadcast together with shapes (3,) (300,) 

I know it's something to do with my dimensions, but I can't seem to figure it out. I'm a noob, so I might be making an obvious mistake.

I have read this link:

fmin_cg: Desired error not necessarily achieved due to precision loss

But it somehow doesn't seem to work for me.

Any help?


Updated sizes for X, y, m, theta:

(100, 3) ----> X
(100, 1) ----> y
100 ----> m
(3, 1) ----> theta


This is how I initialize X, y, m:

import numpy as np
import pandas as pd

data = pd.read_csv('ex2data1.txt', sep=",", header=None)
data.columns = ['x1', 'x2', 'y']
x1 = data.iloc[:, 0].values[:, None]
x2 = data.iloc[:, 1].values[:, None]
y = data.iloc[:, 2].values[:, None]
# join x1 and x2 to make one array of X
X = np.concatenate((x1, x2), axis=1)
m, n = X.shape

ex2data1.txt:

34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
.....
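
Not shown above: X here only has the two feature columns, so to match the (100, 3) and (3, 1) shapes listed earlier there presumably has to be an intercept column of ones and a zero-initialized theta, roughly:

# assumed, since only x1 and x2 appear above: add the intercept column
X = np.concatenate((np.ones((m, 1)), X), axis=1)   # now (100, 3)
initial_theta = np.zeros((n + 1, 1))               # (3, 1), since n was 2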

If it helps, I am trying to re-code in Python one of the homework assignments from Andrew Ng's Coursera ML course.

Finally, I figured out what the problem in my initial program was.

My 'y' was (100, 1) and fmin_cg expects (100,). Once I flattened my 'y', it no longer threw the initial error. But the optimization still wasn't working.
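
In case anyone hits the same (3,) vs (300,) error: here is a toy sketch (random data, shapes matching mine) of what the column-vector 'y' was doing inside grad:

import numpy as np

X = np.random.randn(100, 3)
y = np.random.randn(100, 1)       # column vector, like my original y
theta = np.zeros(3)               # fmin_cg passes theta in as a flat (3,) array

h = X.dot(theta)                  # shape (100,)
print((h - y).shape)              # (100, 100) -- broadcasting, not elementwise subtraction
print(X.T.dot(h - y).flatten().shape)   # (300,) where fmin_cg expected (3,)
print((h - y.ravel()).shape)      # (100,) once y is flattened

With the shapes fixed, the run still ended like this: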

 Warning: Desired error not necessarily achieved due to precision loss.
     Current function value: 0.693147
     Iterations: 0
     Function evaluations: 43
     Gradient evaluations: 41

This was the same as what I achieved without optimization.
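
That value is no accident: 0.693147 is just log(2), which matches the cost at an all-zero theta (sigmoid gives 0.5 for every sample), so the optimizer effectively never moved from the starting point. A quick check:

import numpy as np

# with theta = 0, h = sigmoid(0) = 0.5 everywhere, so
# J = -mean(y*log(0.5) + (1 - y)*log(0.5)) = log(2)
print(np.log(2))   # 0.6931471805599453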

I figured out that the way to optimize this was to use the 'Nelder-Mead' method. I followed this answer: scipy is not optimizing and returns "Desired error not necessarily achieved due to precision loss"

import scipy.optimize as op

Result = op.minimize(fun = costFn,
                     x0 = initial_theta,
                     args = (X, y, m),
                     method = 'Nelder-Mead',
                     options = {'disp': True})#,
                     #jac = grad)

This method doesn't need a Jacobian. I got the results I was looking for:

Optimization terminated successfully.
     Current function value: 0.203498
     Iterations: 157
     Function evaluations: 287

Well, since I don't know exactly how you're initializing m, X, y, and theta, I had to make some assumptions. Hopefully my answer is relevant:

import numpy as np
from scipy.optimize import fmin_cg
from scipy.special import expit

def costFn(theta, X, y, m):
    # expit is the same as sigmoid, but faster
    h = expit(X.dot(theta))

    # instead of 1/m, I take the mean
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    return J #should be a scalar


def grad(theta, X, y, m):
    h = expit(X.dot(theta))
    J =  np.mean((-(y * np.log(h))) - ((1-y) * np.log(1-h)))
    gg =  (X.T.dot(h-y))    
    return gg.flatten()

# initialize matrices
X = np.random.randn(100,3)
y = np.random.randn(100,) #this apparently needs to be a 1-d vector
m = np.ones((3,)) # not using m, used np.mean for a weighted sum (see ali_m's comment)
theta = np.ones((3,1))

xopt = fmin_cg(costFn, fprime=grad, x0=theta, args=(X, y, m), maxiter=400, disp=True, full_output=True )

While the code runs, I don't know enough about your problem to know if this is what you're looking for, but hopefully it helps you understand the problem better. One way to check your answer is to call fmin_cg with fprime=None and see how the answers compare.
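
For example, here is a quick sketch of that check using the toy data above (when fprime is None, fmin_cg falls back to a numerical approximation of the gradient):

# run once with the analytic gradient and once letting fmin_cg approximate it
res_analytic = fmin_cg(costFn, x0=theta, fprime=grad, args=(X, y, m),
                       maxiter=400, disp=False, full_output=True)
res_numeric = fmin_cg(costFn, x0=theta, fprime=None, args=(X, y, m),
                      maxiter=400, disp=False, full_output=True)

# if grad is implemented correctly, both runs should land on a similar
# theta (element 0 of the result tuple) and a similar final cost (element 1)
print(res_analytic[0], res_analytic[1])
print(res_numeric[0], res_numeric[1])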
