
Theano ValueError: dimension mismatch in args to gemm; 2d array dimension is interpreted as 1d

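I am trying to implement a simple XNOR neural network function using Theano, and I am running into a dimension mismatch error:

ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1)

Although the input has dimensions (4x2) and the output has dimensions (4x1), I can't figure out why the input dimension is being read as (8x1).

It should be (4,2)x(2,1)->(4,1), but somehow it is being treated as (8,1)x(2,1)->(8,1).

Any idea why it reads an input of dimension (n, m) as (n*m, 1)?

Simple neural network for the XNOR implementation: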

print 'Importing Theano Library ...'
import theano
print 'Importing General Libraries ...'
import numpy as np
import theano.tensor as T
from theano import function
from theano import shared
from theano.ifelse import ifelse
import os
from random import random
import time

print(theano.config.device)

print 'Building Neural Network ...'
startTime = time.clock()
rng = np.random
#Define variables:
x = T.matrix('x')
w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
b1 = shared(np.asarray(1., dtype=theano.config.floatX))
b2 = shared(np.asarray(1., dtype=theano.config.floatX))
learning_rate = 0.01

a1 = 1/(1+T.exp(-T.dot(x,w1)-b1))
a2 = 1/(1+T.exp(-T.dot(x,w2)-b1))
x2 = T.stack([a1,a2],axis=1)
a3 = 1/(1+T.exp(-T.dot(x2,w3)-b2))

a_hat = T.vector('a_hat') #Actual output
cost = -(a_hat*T.log(a3) + (1-a_hat)*T.log(1-a3)).sum()
dw1,dw2,dw3,db1,db2 = T.grad(cost,[w1,w2,w3,b1,b2])

train = function(inputs = [x,a_hat], outputs = [a3,cost], updates = [[w1, w1-learning_rate*dw1],[w2, w2-learning_rate*dw2],[w3, w3-learning_rate*dw3],[b1, b1-learning_rate*db1],[b2, b2-learning_rate*db2]])

print 'Neural Network Built'
TimeDelta = time.clock() - startTime
print 'Building Time: %.2f seconds' %TimeDelta


inputs = np.array([[0,0],[0,1],[1,0],[1,1]]).astype(theano.config.floatX)
outputs = np.array([1,0,0,1]).astype(theano.config.floatX)

#Iterate through all inputs and find outputs:

print 'Training the network ...'
startTime = time.clock()
cost = []
print 'input shape', inputs.shape
print 'output shape', outputs.shape

for iteration in range(60000):
    print 'Iteration no. %d \r' %iteration,
    pred, cost_iter = train(inputs, outputs)
    cost.append(cost_iter)

TimeDelta = time.clock() - startTime
print 'Training Time: %.2f seconds' %TimeDelta

#Print the outputs:
print 'The outputs of the NN are: '

for i in range(len(inputs)):
    print 'The output for x1=%d | x2=%d is %.2f' % (inputs[i][0], inputs[i][1], pred[i])

predict = function([x],a3)

print predict([[0,0]])
print predict([[0,1]])
print predict([[1,0]])
print predict([[1,1]])

Terminal output:

Importing Theano Library ...
Using gpu device 0: NVIDIA Tegra X1 (CNMeM is enabled with initial size: 75.0% of memory, cuDNN 5005)
Importing General Libraries ...
gpu
Building Neural Network ...
Neural Network Built
Building Time: 1.78 seconds
Training the network ...
input shape (4, 2)
output shape (4,)
Traceback (most recent call last):
  File "neuron2.py", line 59, in <module>
    pred, cost_iter = train(inputs, outputs)
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 879, in __call__
    storage_map=getattr(self.fn, 'storage_map', None))
  File "/home/ubuntu/Theano/theano/gof/link.py", line 325, in raise_with_op
    reraise(exc_type, exc_value, exc_trace)
  File "/home/ubuntu/Theano/theano/compile/function_module.py", line 866, in __call__
    self.fn() if output_subset is None else\
ValueError: dimension mismatch in args to gemm (8,1)x(2,1)->(8,1)
Apply node that caused the error: GpuDot22(GpuReshape{2}.0, GpuReshape{2}.0)
Toposort index: 68
Inputs types: [CudaNdarrayType(float32, matrix), CudaNdarrayType(float32, matrix)]
Inputs shapes: [(8, 1), (2, 1)]
Inputs strides: [(1, 0), (1, 0)]
Inputs values: ['not shown', CudaNdarray([[ 0.14762458]
 [ 0.12991147]])]
Outputs clients: [[GpuReshape{3}(GpuDot22.0, Join.0)]]

HINT: Re-running with most Theano optimization disabled could give you a back-trace of when this node was created. This can be done with by setting the Theano flag 'optimizer=fast_compile'. If that does not work, Theano optimizations can be disabled with 'optimizer=None'.
HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.
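
The shapes reported in the traceback can be reproduced with plain NumPy. The following is only a rough sketch, not Theano's actual code path: it assumes that the 3-D operand comes from stacking two (4, 1) activations and that it is collapsed to two dimensions before the matrix product. The name `flattened` is illustrative and does not appear in the original script.

```python
import numpy as np

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")

# Weights built the same way as in the script: a (2, 1) matrix, not a (2,) vector.
w = np.array([np.random.random(1), np.random.random(1)]).astype("float32")

a1 = np.dot(x, w)                  # shape (4, 1)
a2 = np.dot(x, w)                  # shape (4, 1)
x2 = np.stack([a1, a2], axis=1)    # shape (4, 2, 1): 3-D instead of (4, 2)

# Assumption: the 3-D input is collapsed to 2-D while keeping its last axis.
flattened = x2.reshape(-1, x2.shape[-1])   # shape (8, 1)

try:
    np.dot(flattened, w)           # (8, 1) x (2, 1): inner dimensions 1 and 2 differ
    mismatch = False
except ValueError:
    mismatch = True

print(x2.shape, flattened.shape, mismatch)
```

Under these assumptions the input data itself is never read as (8, 1). The (8, 1) operand would be the stacked hidden layer, whose extra trailing axis comes from the (2, 1) weights.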

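The shared variables w1, w2 and w3 are being created as matrices by the way they are constructed, but they should be vectors. The construction should be done as follows.

These lines: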

w1 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w2 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))
w3 = shared(np.array([rng.random(1).astype(theano.config.floatX), rng.random(1).astype(theano.config.floatX)]))

should be:

from random import random
w1 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
w2 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
w3 = shared(np.asarray([random(), random()], dtype=theano.config.floatX))
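
As a quick sanity check of the shapes, the forward pass can be mirrored in plain NumPy with the vector-shaped weights. This is a minimal sketch that uses NumPy as a stand-in for the Theano graph, so it only illustrates how the shapes propagate. The `sigmoid` helper is introduced here for readability and is not part of the original script.

```python
import numpy as np
from random import random

def sigmoid(z):
    # Same expression as 1/(1+T.exp(-z)) in the Theano script.
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype="float32")

# Two Python floats per weight give 1-D arrays of shape (2,).
w1 = np.asarray([random(), random()], dtype="float32")
w2 = np.asarray([random(), random()], dtype="float32")
w3 = np.asarray([random(), random()], dtype="float32")
b1 = np.float32(1.0)
b2 = np.float32(1.0)

a1 = sigmoid(np.dot(x, w1) + b1)     # shape (4,)
a2 = sigmoid(np.dot(x, w2) + b1)     # shape (4,)
x2 = np.stack([a1, a2], axis=1)      # shape (4, 2), a proper matrix
a3 = sigmoid(np.dot(x2, w3) + b2)    # shape (4,), matching the target vector

print(w1.shape, x2.shape, a3.shape)
```

With 1-D weights the stacked hidden layer is a (4, 2) matrix, so the final product has matching inner dimensions and the output has the same shape as the (4,) target array.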
