When I send a numpy array to a givens parameter in a theano function, why do I get this Theano TypeError

I am new to theano, and am still struggling with the "pseudo code" style of theano on the one hand and its strict type checking on the other. I am more of a C programmer and a python programmer. Can someone please point out where I am going wrong in this example code, which uses the mean square error between the predicted y points and the training y points for the x values to get the optimal slope and intercept of a linear fit?

The code is below:

import numpy as np
import theano
import theano.tensor as T
from collections import OrderedDict

class LinearModel:
    def __init__(self,num_points):
        self.m = theano.shared(value=0.1,name='m')
        self.b = theano.shared(value=1, name='b')
        self.params = [self.m, self.b]

        def step(x_t):
            y_t = self.m * x_t + self.b
            return y_t

        self.x = T.matrix('x',dtype=theano.config.floatX)
        self.y, _ = theano.scan(
                        fn=step,
                        sequences=self.x,
                    ) 

        self.loss = lambda y_train: self.mse(y_train)

    def mse(self, y_train):
        return T.mean((self.y - y_train) ** 2)

    def fit(self,x, y, learning_rate=0.01, num_iter=100):
        trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True)
        trainset_y = theano.tensor._shared(y.astype(np.dtype(np.float32)),borrow=True)
        n_train = trainset_x.get_value(borrow=True).shape[0]

        cost = self.loss(trainset_y)
        gparams = T.grad(cost,self.params)

        l_r = T.scalar('l_r', dtype=theano.config.floatX)

        updates = OrderedDict()
        for param,gparam in zip(self.params,gparams):
            updates[param] = param - l_r * gparam

        self.train_model = theano.function(  inputs=[l_r],
                                        outputs=[cost,self.y],
                                        updates=updates,
                                        givens={
                                              self.x: trainset_x,
                                            }
                                        )

        epoch = 0
        while epoch < num_iter:
            cost, _ = self.train_model(learning_rate)
            m = self.m.get_value()
            b = self.b.get_value()
            print "epoch: ",epoch," cost: ",cost," m: ",m," b: ",b


if __name__ == '__main__':
    lin = LinearModel(10)
    x = np.arange(10)
    y = np.random.rand(10)
    lin.fit(x,y,learning_rate=0.01,num_iter=100)

The error is:

    Traceback (most recent call last):
      File "~/EclipseWorkspace/MemoryNetworkQA.Theano/linear_regression.py", line 70, in <module>
        lin.fit(x,y,learning_rate=0.01,num_iter=100)
      File "~/EclipseWorkspace/MemoryNetworkQA.Theano/linear_regression.py", line 54, in fit
        self.x: trainset_x,
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/function.py", line 266, in function
        profile=profile)
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 489, in pfunc
        no_default_updates=no_default_updates)
      File "/usr/local/lib/python2.7/dist-packages/theano/compile/pfunc.py", line 217, in rebuild_collect_shared
        raise TypeError(err_msg, err_sug)

TypeError: ('An update must have the same type as the original shared variable (shared_var=b, shared_var.type=TensorType(int64, scalar), update_val=Elemwise{sub,no_inplace}.0, update_val.type=TensorType(float64, scalar)).', 'If the difference is related to the broadcast pattern, you can call the tensor.unbroadcast(var, axis_to_unbroadcast[, ...]) function to remove broadcastable dimensions.')

This code won't execute until the following problems have been addressed.

  1. The error reported in the question is due to the type of self.b not matching the type of the update for self.b. self.b has no type specified so one has been inferred. The initial value is a Python integer so the inferred type is int64. The update is a floatX because the learning rate is a floatX. You can't update an int64 with a floatX. The solution is to make the initial value a Python float, resulting in an inferred floatX type. Change self.b = theano.shared(value=1, name='b') to self.b = theano.shared(value=1., name='b') (note the decimal point after the 1).
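Theano's dtype inference from a Python scalar mirrors NumPy's: an integer initial value yields an integer dtype, a float yields a floating one. A small NumPy-only sketch of that inference rule (an illustration, not the Theano code itself):

```python
import numpy as np

# A Python int infers an integer dtype; a Python float infers a floating dtype.
b_int = np.asarray(1)    # integer scalar -> dtype kind 'i' (e.g. int64)
b_flt = np.asarray(1.)   # float scalar   -> dtype kind 'f' (e.g. float64)

print(b_int.dtype.kind, b_flt.dtype.kind)  # 'i' 'f'
```

This is why the single decimal point after the 1 is enough to change the inferred type of the shared variable and make the update type-compatible.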

  2. The next problem is that self.x is defined as a matrix but the value passed in the function call in the last line is a vector. A solution is to reshape x into a matrix, e.g. change x = np.arange(10) to x = np.arange(10).reshape(1,10).
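The mismatch is purely one of dimensionality: a matrix variable expects a 2-D value. Reshaping the 1-D array into a single-row 2-D array makes the shapes agree, as this NumPy check illustrates:

```python
import numpy as np

x = np.arange(10)
print(x.ndim, x.shape)          # 1-D vector: 1 (10,)

x_mat = x.reshape(1, 10)        # single-row matrix
print(x_mat.ndim, x_mat.shape)  # 2-D matrix: 2 (1, 10)
```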

  3. The trainset shared variables have type float32, but this conflicts with other areas of the code that work with floatX. If your floatX=float32 then there should be no problem, but it would be safer to simply use floatX to maintain the same float type throughout. Change trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True) to trainset_x = theano.tensor._shared(x.astype(theano.config.floatX),borrow=True), and similarly for trainset_y.

  4. The number of epochs currently has no effect because epoch is never incremented. Change while epoch < num_iter: to for epoch in xrange(num_iter): and remove epoch = 0.
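Once those four fixes are in, the compiled function just iterates the gradient-descent update param ← param − l_r · ∂cost/∂param on the mean squared error. A NumPy-only sketch of that same update rule, with the gradients written out analytically (an illustration of what Theano's T.grad computes here, not the author's code):

```python
import numpy as np

# Gradient descent on mean((m*x + b - y)**2), the same update rule
# the Theano function compiles from T.grad and the updates dict.
def fit_line(x, y, learning_rate=0.01, num_iter=1000):
    m, b = 0.1, 1.0  # same initial values as the shared variables
    for epoch in range(num_iter):
        err = m * x + b - y
        # Analytic gradients of the mean squared error w.r.t. m and b.
        gm = 2.0 * np.mean(err * x)
        gb = 2.0 * np.mean(err)
        m -= learning_rate * gm
        b -= learning_rate * gb
    return m, b

x = np.linspace(0.0, 1.0, 20)
y = 2.0 * x + 1.0  # noiseless line: slope 2, intercept 1
m, b = fit_line(x, y, learning_rate=0.5, num_iter=2000)
```

On this noiseless data the loop recovers m ≈ 2 and b ≈ 1, which also shows why the original run seemed frozen: convergence happens within a few hundred iterations, and the unbounded while loop then prints the same converged values forever.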

In addition,

  • The parameters look like they're not updating, but this is a false view. The iterations pass quickly and never stop because of problem 4 above, and the learning rate is large enough to make the model converge very quickly. Try changing the learning rate to something much smaller, e.g. 0.0001, and look at the output for only the first 100 epochs.

  • I'd recommend avoiding the use of theano.tensor._shared unless you really do need to force the shared variable to be allocated on the CPU when device=gpu. The preferred method is theano.shared.

  • The n_train variable isn't used anywhere.

  • You're using givens inconsistently. I'd recommend using it for both x and y, or for neither. Take a look at the logistic regression tutorial for more pointers on this.

  • The Theano function is being recompiled on every call to fit, but you'd be better off compiling it only once and reusing it on each fit.

  • This model can be implemented without using scan. In general, scan is only needed when the output of a step is a function of the output of an earlier step. scan is also generally much slower than the alternatives and should be avoided when possible. You can remove scan by using self.y = self.m * self.x + self.b instead.
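The step function applies m*x_t + b independently to each element, with no dependence between steps, which is exactly what an elementwise broadcast expression already computes. A NumPy sketch of the equivalence (standing in for the scan-vs-vectorized comparison):

```python
import numpy as np

m, b = 0.1, 1.0
x = np.arange(10, dtype=np.float64)

# What scan does here: apply the step to each element in turn.
y_loop = np.array([m * x_t + b for x_t in x])

# The vectorized form computes the same values in one broadcast expression.
y_vec = m * x + b

assert np.allclose(y_loop, y_vec)
```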

  • If you do use scan, it's good practice to enable strict mode via strict=True in the scan call.

  • It's good practice to explicitly provide types for all shared variables. You do this for trainset_x and trainset_y, but not for self.m and self.b.

Okay, I found that the problem was really in self.b. After initializing it with an explicit float, the type error goes away.

But the slope and intercept (self.m and self.b), which are still theano shared variables and are being passed in through updates, are not really getting updated. If anyone can tell me why, it would be a great help. Thanks.

import numpy as np
import theano
import theano.tensor as T
from collections import OrderedDict

class LinearModel:
    def __init__(self,num_points):
        self.m = theano.shared(value=0.1,name='m')
        self.b = theano.shared(value=1.0, name='b')
        self.params = [self.m, self.b]

        def step(x_t):
            y_t = self.m * x_t + self.b
            return y_t

        #self.x = T.matrix('x',dtype=theano.config.floatX)
        #self.x = T.dmatrix('x')
        self.x = T.vector('x',dtype=theano.config.floatX)
        self.y, _ = theano.scan(
                        fn=step,
                        sequences=self.x,
                    ) 

        self.loss = lambda y_train: self.mse(y_train)

    def mse(self, y_train):
        return T.mean((self.y - y_train) ** 2)

    def fit(self,x, y, learning_rate=0.01, num_iter=100):
        trainset_x = theano.tensor._shared(x.astype(np.dtype(np.float32)),borrow=True)
        trainset_y = theano.tensor._shared(y.astype(np.dtype(np.float32)),borrow=True)
        n_train = trainset_x.get_value(borrow=True).shape[0]

        cost = self.loss(trainset_y)
        gparams = T.grad(cost,self.params)

        l_r = T.scalar('l_r', dtype=theano.config.floatX)

        updates = OrderedDict()
        for param,gparam in zip(self.params,gparams):
            updates[param] = param - l_r * gparam

        self.train_model = theano.function(  inputs=[l_r],
                                        outputs=[cost,self.y],
                                        updates=updates,
                                        givens={
                                              self.x: trainset_x,
                                            }
                                        )


        epoch = 0
        while epoch < num_iter:
            cost, _ = self.train_model(learning_rate)
            m = self.m.get_value()
            b = self.b.get_value()
            print "epoch: ",epoch," cost: ",cost," m: ",m," b: ",b
            epoch += 1


if __name__ == '__main__':
    lin = LinearModel(10)
    x = np.array([1.0,2.0,3.0,4.0,5.0,6.0,7.0,8.0,9.0,10.0])
    y = np.random.rand(10)
    lin.fit(x,y,learning_rate=0.01,num_iter=100)
