
TensorFlow: Initializer for variable in LSTM cell

I am trying to build an RNN to predict the sentiment of input data as positive or negative.

tf.reset_default_graph()

input_data = tf.placeholder(tf.int32, [batch_size, 40])
labels = tf.placeholder(tf.int32, [batch_size, 40])

data = tf.Variable(tf.zeros([batch_size, 40, 50]), dtype=tf.float32)
data = tf.nn.embedding_lookup(glove_embeddings_arr, input_data)

lstm_cell = tf.contrib.rnn.BasicLSTMCell(lstm_units)
lstm_cell = tf.contrib.rnn.DropoutWrapper(cell = lstm_cell, output_keep_prob = 0.75)
value,state = tf.nn.dynamic_rnn(lstm_cell, data, dtype=tf.float32)

weight = tf.Variable(tf.truncated_normal([lstm_units, classes]))
bias = tf.Variable(tf.constant(0.1, shape = [classes]))
value = tf.transpose(value, [1,0,2])
last = tf.gather(value, int(value.get_shape()[0]) - 1)
prediction = (tf.matmul(last, weight) + bias)



true_pred = tf.equal(tf.argmax(prediction, 1), tf.argmax(labels,1))
accuracy = tf.reduce_mean(tf.cast(true_pred,tf.float32))

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=labels))
optimizer = tf.train.AdamOptimizer().minimize(loss)

Running this code raises:

ValueError: An initializer for variable rnn/basic_lstm_cell/kernel of <dtype: 'string'> is required

Can someone explain this error to me?

The problem is most likely that you are feeding raw input text to the network. That part is not in your code snippet, but the error message reports <dtype: 'string'>:

ValueError: An initializer for variable rnn/basic_lstm_cell/kernel of <dtype: 'string'> is required

The dtype is deduced from the input that the LSTM cell receives. The cell's internal variables (kernel and bias) are created with a default initializer, which (at least for now) can handle only floating-point and integer types and fails for anything else. In your case the dtype is tf.string, which is why you see this error.
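In the question's code, the tensor passed to tf.nn.dynamic_rnn comes from tf.nn.embedding_lookup, so its dtype follows that of glove_embeddings_arr. The snippet does not show how that array is built, so the following is only a sketch of one way a string dtype could arise and how to check for it before building the graph. The sample lines, parse_glove_line and is_all_float are hypothetical names introduced for illustration.

```python
# Hypothetical GloVe-style text lines: a word followed by its vector components.
glove_lines = [
    "the 0.1 0.2 0.3",
    "cat 0.4 0.5 0.6",
]


def parse_glove_line(line):
    """Split one line into (word, list of floats)."""
    parts = line.split()
    return parts[0], [float(x) for x in parts[1:]]


def is_all_float(table):
    """Return True if every entry of a 2-D list is a Python float."""
    return all(isinstance(x, float) for row in table for x in row)


# Keeping the raw split tokens leaves strings in the table. An array built
# from rows like these would carry a string dtype into the LSTM cell.
raw_rows = [line.split() for line in glove_lines]

# Separating the word from the numbers gives a purely numeric table.
words, vectors = zip(*(parse_glove_line(line) for line in glove_lines))
vectors = [list(v) for v in vectors]

print(is_all_float(raw_rows))  # False
print(is_all_float(vectors))   # True
```

If the check fails for your own embedding table, the fix is to keep the words in a separate lookup structure and pass only the numeric vectors to the network.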

What you should do is transform your input sentences into numeric vectors. The best way to do this is with a word embedding, e.g. word2vec, but simple word indexing also works. Take a look at this post, particularly at how it handles text data. It also includes a complete working code example.
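As a rough illustration of the simple word indexing mentioned above, here is a minimal sketch in plain Python. The vocabulary layout, the padding and unknown-word ids, and the helper names build_vocab and encode are assumptions made for this example, not something taken from the question. Only the sequence length of 40 matches the placeholder shape in the original code.

```python
SEQ_LEN = 40  # matches the [batch_size, 40] placeholder in the question
PAD_ID = 0    # hypothetical id reserved for padding
UNK_ID = 1    # hypothetical id for words missing from the vocabulary


def build_vocab(sentences):
    """Assign each distinct lowercased word an integer id, in order seen."""
    vocab = {"<pad>": PAD_ID, "<unk>": UNK_ID}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab, seq_len=SEQ_LEN):
    """Map a sentence to a fixed-length list of integer ids."""
    ids = [vocab.get(word, UNK_ID) for word in sentence.lower().split()]
    ids = ids[:seq_len]                           # truncate long sentences
    return ids + [PAD_ID] * (seq_len - len(ids))  # pad short ones


sentences = ["the movie was great", "the plot was dull"]
vocab = build_vocab(sentences)
batch = [encode(s, vocab) for s in sentences]

print(batch[0][:6])  # [2, 3, 4, 5, 0, 0]
```

A batch of integer ids like this is the kind of value the tf.int32 input_data placeholder expects. The embedding lookup then turns each id into a float vector, provided the embedding table itself is numeric.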
