
Error with dimensions in Keras

I am trying to implement a simple word2vec model, but I get the following error:

ValueError: Error when checking target: expected dense-softmax to have 3 dimensions, but got array with shape (32, 14).

The variables train_x and train_y are 32 rows of the form

[[0 0 0 0 0 0 0 0 0 1 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 1 0 0 0 0 0 0 0 0 0]
                          ...]]

and the Python code is the following:

vocal_size = 14
input = Input(shape=(vocal_size, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim= vocal_size)(input)
output = Dense(vocal_size, use_bias=False, activation='softmax')(embeddings)
model = Model(input=input, output=output)
model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()
model.fit(train_x, train_y)



_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           (None, 14)                0         
_________________________________________________________________
embeddings (Embedding)       (None, 14, 5)             70        
_________________________________________________________________
dense_1 (Dense)              (None, 14, 14)            70        
=================================================================
Total params: 140
Trainable params: 140
Non-trainable params: 0

Edit:

("I like stackoverflow") with context size 1, I create the following tuples, (“我喜欢stackoverflow”),上下文大小为1,创建以下元组,
("I", "like"), ("like", "I") , ("like", "stackoverflow"), ("stackoverflow", "like") (“ I”,“ like”),(“ like”,“ I”),(“ like”,“ stackoverflow”),(“ stackoverflow”,“ like”)

Then I do a one-hot encoding of all of them and feed them to the model.

train_x[0] -> the one-hot encoding of the word "I"
train_y[0] -> the one-hot encoding of the context word "like"
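
For reference, a minimal sketch of this pair generation and encoding (the window loop and the one_hot helper are illustrative stand-ins, not part of the original code):

import numpy as np

sentence = "I like stackoverflow".split()
vocab = sorted(set(sentence))
word_to_idx = {w: i for i, w in enumerate(vocab)}

window = 1  # context size
pairs = []
for i, word in enumerate(sentence):
    for j in range(max(0, i - window), min(len(sentence), i + window + 1)):
        if j != i:
            pairs.append((word, sentence[j]))
# pairs: [('I', 'like'), ('like', 'I'), ('like', 'stackoverflow'), ('stackoverflow', 'like')]

def one_hot(word):
    v = np.zeros(len(vocab))
    v[word_to_idx[word]] = 1
    return v

train_x = np.array([one_hot(w) for w, _ in pairs])  # shape (4, 3) here; (32, 14) in the question
train_y = np.array([one_hot(c) for _, c in pairs])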

Edit 2:

Using the first encoding for skip-gram: treating 0 as a special word (i.e. not in the top 10,000 most frequent) and starting the counting from 1. I assume I should give a single number as input and output a one-hot encoding, e.g. for ("stack", "overflow"): input [3] ("stack") and output [0,0,0,0,1,0,0,0,0,0,0] ("overflow").

Input(shape=(1,), ...) ->
Embedding(output_dim=embedding_size, input_dim=vocab_size, mask_zero=True, ...) ->
Dense(vocab_size+1, activation='softmax')
model.compile(optimizer='SGD', loss='categorical_crossentropy')

I.e. embedding_size = 5, with the sentences in your example as input:

https://imgur.com/a/32m4z
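
A runnable version of the sketch above might look like the following (assumptions: vocab_size = 10 as in the ("stack", "overflow") example; mask_zero is dropped and a Flatten added, since Dense does not consume masks and the Embedding output has shape (None, 1, embedding_size)):

from keras.layers import Input, Embedding, Flatten, Dense
from keras import Model

vocab_size = 10     # words are indexed 1..10; 0 is reserved for padding
embedding_size = 5

inp = Input(shape=(1,), dtype='int32')                    # a single word index, e.g. [3]
emb = Embedding(output_dim=embedding_size,
                input_dim=vocab_size + 1,                 # +1 for the reserved 0
                input_length=1)(inp)                      # -> (None, 1, embedding_size)
flat = Flatten()(emb)                                     # -> (None, embedding_size)
out = Dense(vocab_size + 1, activation='softmax')(flat)   # one-hot target of length 11
model = Model(input=inp, output=out)
model.compile(optimizer='sgd', loss='categorical_crossentropy')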

Thanks for the edit. You're running into trouble for two reasons, one shallow and one deep. First, the shallow one: the Dense layer's output here is three-dimensional (because the Embedding output is three-dimensional), while your target array is two-dimensional with shape (32, 14). You can fix this with a Flatten:

input = Input(shape=(vocal_size, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim=vocal_size+1, input_length=vocal_size)(input)
flat = Flatten()(embeddings)
output = Dense(vocal_size, use_bias=False, activation='softmax')(flat)

The deep one is that one-hot encoding and embedding are two options that serve the same purpose, so you don't need both (see here and here).
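
To make that concrete, a small numpy illustration (the weight matrix is random, purely for demonstration): an Embedding lookup of index i returns exactly the one-hot vector for i multiplied by the embedding matrix, so stacking both performs the same projection twice.

import numpy as np

vocab_size, embedding_dim = 10, 5
W = np.random.rand(vocab_size, embedding_dim)  # stand-in for learned embedding weights

word_index = 3
one_hot = np.zeros(vocab_size)
one_hot[word_index] = 1

# the matrix product just selects row 3 of W, which is what Embedding returns
assert np.allclose(one_hot @ W, W[word_index])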

The embedding layer wants a series of "sentences" made of integers representing words (or tuples), plus a vocabulary size, so something like

['Welcome to stack overflow',
 'stack overflow is great',
 "Hope it's helpful to you"]

would be represented as

[[1,2,3,4,0],[3,4,5,6,0],[7,8,9,2,10]] 
# 0s are there to "pad" sentences 1 & 2 as they all need to be the same length

and fed into an embedding layer like this:

input = Input(shape=(5, ), dtype='int32')
embeddings = Embedding(output_dim=5, input_dim=11, input_length=5)(input)
# input_dim is 11 because we want 1 more than the number of words in our vocabulary
# padding can be done with the Keras function pad_sequences
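
For instance, the padding mentioned in that comment could be done like this (padding='post' appends the zeros at the end, matching the layout above):

from keras.preprocessing.sequence import pad_sequences

seqs = [[1,2,3,4], [3,4,5,6], [7,8,9,2,10]]
print(pad_sequences(seqs, maxlen=5, padding='post'))
# [[ 1  2  3  4  0]
#  [ 3  4  5  6  0]
#  [ 7  8  9  2 10]]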

As I'm certain you know, the one-hot encoding of our sentences would look like this:

[[1,1,1,1,0,0,0,0,0,0],
 [0,0,1,1,1,1,0,0,0,0],
 [0,1,0,0,0,0,1,1,1,1]]
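
Assuming the integer encoding above (indices 1..10, with 0 reserved for padding), one way to derive that binary matrix, shown as a sketch:

import numpy as np

a = [[1,2,3,4,0], [3,4,5,6,0], [7,8,9,2,10]]
b = np.zeros((len(a), 10))
for row, seq in enumerate(a):
    for idx in seq:
        if idx != 0:              # skip padding
            b[row, idx - 1] = 1   # index 1 maps to position 0, etc.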

Because the sentences have already been transformed (one-hot has "embedded" our sentences as binary vectors in 10-dimensional space), we can feed this directly into a Dense layer without the need for further embedding:

input = Input(shape=(vocal_size, ), dtype='float32', name='input')  # one-hot vectors are float inputs
output = Dense(vocal_size, use_bias=False, activation='softmax')(input)

Here's a functional toy example using both ways:

from keras.layers import Dense,Activation,Embedding,Input,Flatten
from keras import Model
import numpy as np

words = ['Welcome to stack overflow',
    'stack overflow is great',
    'Hope it\'s helpful to you']

a = [[1,2,3,4,0],[3,4,5,6,0],[7,8,9,2,10]]  # integer-encoded, padded sentences
b = [[1,1,1,1,0,0,0,0,0,0],
     [0,0,1,1,1,1,0,0,0,0],
     [0,1,0,0,0,0,1,1,1,1]]  # one-hot encoding of the same sentences
c = [1,1,0]  # hypothetical target is "references stack overflow"

input = Input(shape=(5, ), dtype='int32', name='input')
embeddings = Embedding(output_dim=5, input_dim=11, input_length=5)(input)
flat = Flatten()(embeddings)
output = Dense(1, activation='sigmoid')(flat)  # sigmoid, not softmax: a 1-unit softmax always outputs 1
model = Model(input=input, output=output)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.summary()
model.fit(np.array(a),np.array(c))

input2 = Input(shape=(10, ), dtype='float32')
output2 = Dense(1, activation='sigmoid')(input2)
model2 = Model(input=input2, output=output2)
model2.compile(optimizer='adam', loss='binary_crossentropy')
model2.summary()
model2.fit(np.array(b),np.array(c))
