Theano ValueError: dimension mismatch in args to gemm; 2-D array dimensions interpreted as 1-D

Question

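I am trying to implement a simple XNOR neural network with Theano, and I am running into a dimension mismatch:

ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1)

The input has shape (4, 2) and the output has shape (4, 1), so I don't understand why the input is being read as (8, 1).

It should be (4,2) x (2,1) -> (4,1), but somehow it is being treated as (8,1) x (2,1) -> (8,1).

Any idea why an input of shape (n, m) is being read as (n * m, 1)?

A simple neural network for the XNOR implementation: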

print 'Importing Theano Library ...'
import theano
print 'Importing General Libraries ...'
import numpy as np
import theano.tensor as T
from theano import function
from theano import shared
from theano.ifelse import ifelse
import os
from random import random
import time

print(theano.config.device)

print 'Building Neural Network ...'
startTime = time.clock()
rng = np.random
#Define variables:
x = T.matrix('x')
w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
b1 = shared(np.asarray(1., dtype=theano.config.floatX))
b2 = shared(np.asarray(1., dtype=theano.config.floatX))
learning_rate = 0.01

a1 = 1/(1+T.exp(-T.dot(x,w1)-b1))
a2 = 1/(1+T.exp(-T.dot(x,w2)-b1))
x2 = T.stack([a1,a2],axis=1)
a3 = 1/(1+T.exp(-T.dot(x2,w3)-b2))

a_hat = T.vector('a_hat') #Actual output
cost = -(a_hat*T.log(a3) + (1-a_hat)*T.log(1-a3)).sum()
dw1,dw2,dw3,db1,db2 = T.grad(cost,[w1,w2,w3,b1,b2])

train = function(inputs = [x,a_hat], outputs = [a3,cost], updates = [[w1, w1-learning_rate*dw1],[w2, w2-learning_rate*dw2],[w3, w3-learning_rate*dw3],[b1, b1-learning_rate*db1],[b2, b2-learning_rate*db2]])

print 'Neural Network Built'
TimeDelta = time.clock() - startTime
print 'Building Time: %.2f seconds' %TimeDelta


inputs = np.array([[0,0],[0,1],[1,0],[1,1]]).astype(theano.config.floatX)
outputs = np.array([1,0,0,1]).astype(theano.config.floatX)

#Iterate through all inputs and find outputs:

print 'Training the network ...'
startTime = time.clock()
cost = []
print 'input shape', inputs.shape
print 'output shape', outputs.shape

for iteration in range(60000):
    print 'Iteration no. %d \r' %iteration,
    pred, cost_iter = train(inputs, outputs)
    cost.append(cost_iter)

TimeDelta = time.clock() - startTime
print 'Training Time: %.2f seconds' %TimeDelta

#Print the outputs:
print 'The outputs of the NN are: '

for i in range(len(inputs)):
    print 'The output for x1=%d | x2=%d is %.2f' % (inputs[i][0], inputs[i][1], pred[i])

predict = function([x],a3)

print predict([[0,0]])
print predict([[0,1]])
print predict([[1,0]])
print predict([[1,1]])

Terminal output:

Importing Theano Library ...
Using gpu device 0: NVIDIA Tegra X1 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 5005)
Importing General Libraries ...
gpu
Building Neural Network ...
Neural Network Built
Building Time: 1.78 seconds
Training the network ...
input shape (4, 2)
output shape (4,)
Traceback (most recent call last):
  File "neuron2.py", line 59, in <module>
    pred, cost_iter = train(inputs, outputs)
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 879, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/ubuntu/Theano/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 866, in __call__
    self.fn() if output_subset is None else\
ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1)
Apply node that caused the error: GpuDot22(GpuReshape{2}.0, GpuReshape{2}.0)
Toposort index: 68
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(8, 1), (2, 1)]
Inputs strides: [(1, 0), (1, 0)]
Inputs values: ['not shown', CudaNdarray([[ 0.14762458]
 [ 0.12991147]])]
Outputs clients: [[GpuReshape{3}(GpuDot22.0, Join.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

Answer 1

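The way the shared variables w1, w2, and w3 are constructed makes them matrices: np.array applied to two length-1 arrays has shape (2, 1). They should be vectors of shape (2,), so construct them as follows.

These lines: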

w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))

should be:

from random import random
w1 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
w2 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
w3 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
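
To see where the (8, 1) might come from, the shapes can be traced in plain NumPy. This is only a sketch under assumptions: it leaves out the sigmoid and the biases (neither changes a shape), it reuses one hidden activation twice in place of a1 and a2, and it assumes Theano flattens the leading axes of a 3-D operand before the 2-D GPU dot, which is consistent with the GpuReshape nodes in the traceback but is not confirmed here. The names w_matrix, w_vector, x2_bad, and x2_good are made up for the illustration.

```python
import numpy as np

rng = np.random

# Weights as built in the question: an array of two length-1 arrays is 2-D.
w_matrix = np.array([rng.random(1), rng.random(1)])    # shape (2, 1)
# Weights as built in the answer: an array of two scalars is 1-D.
w_vector = np.asarray([rng.random(), rng.random()])    # shape (2,)

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # shape (4, 2)

# With (2, 1) weights each hidden activation is a (4, 1) column,
# so stacking two of them on axis 1 yields a 3-D array.
a1 = np.dot(x, w_matrix)                               # shape (4, 1)
x2_bad = np.stack([a1, a1], axis=1)                    # shape (4, 2, 1)
# Flattening the leading axes for a 2-D matrix product gives (8, 1),
# which cannot be multiplied by the (2, 1) weights.
flattened = x2_bad.reshape(-1, x2_bad.shape[-1])       # shape (8, 1)

# With (2,) weights each hidden activation is a (4,) vector,
# so the stacked hidden layer is the expected (4, 2) matrix.
h1 = np.dot(x, w_vector)                               # shape (4,)
x2_good = np.stack([h1, h1], axis=1)                   # shape (4, 2)
out = np.dot(x2_good, w_vector)                        # shape (4,)

print(w_matrix.shape, w_vector.shape)
print(x2_bad.shape, flattened.shape)
print(x2_good.shape, out.shape)
```

With vector weights the final result has shape (4,), which also matches the T.vector target a_hat used in the cost.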
