Sequence-to-sequence modeling in Python

I am trying to make a chatbot that uses a sequence-to-sequence model to respond to the user's input. The problem is that the input sequence, which is a list of words, will almost never be the same from one message to the next. I have created a vocabulary that maps each word to its own unique id, but the input still varies in length, so I can't feed it directly to a model that expects a fixed-size input. I understand that it is possible to use an encoder to map the sequence of words to a fixed vector representation and then have a decoder map that vector back to a sequence.

My question is: how would I go about encoding the sequence of words into a fixed vector? Is there a technique that could be used for this purpose?

Mapping a sequence of words to a vector representation can be accomplished with a recurrent neural network (RNN). You can take a look at this introduction: http://colah.github.io/posts/2015-08-Understanding-LSTMs/
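To make the idea concrete, here is a minimal sketch of how a recurrent encoder turns word-id sequences of any length into a vector of fixed size. It is plain Python with randomly initialised, untrained weights, so the vectors carry no meaning yet; the names make_encoder and encode, the toy vocabulary, and the layer sizes are all made up for illustration and are not taken from any library.

```python
import math
import random


def make_encoder(vocab_size, embed_dim=8, hidden_dim=16, seed=0):
    """Build random (untrained) parameters for a tiny Elman-style RNN."""
    rng = random.Random(seed)

    def matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "embed": matrix(vocab_size, embed_dim),   # one row per word id
        "w_in": matrix(hidden_dim, embed_dim),    # input -> hidden weights
        "w_rec": matrix(hidden_dim, hidden_dim),  # hidden -> hidden weights
        "hidden_dim": hidden_dim,
    }


def encode(params, word_ids):
    """Fold a variable-length list of word ids into one fixed-size vector."""
    h = [0.0] * params["hidden_dim"]  # initial hidden state
    for word_id in word_ids:
        x = params["embed"][word_id]  # look up the word's embedding
        # New state depends on the current word and the previous state.
        h = [
            math.tanh(
                sum(w * xi for w, xi in zip(w_in_row, x))
                + sum(w * hi for w, hi in zip(w_rec_row, h))
            )
            for w_in_row, w_rec_row in zip(params["w_in"], params["w_rec"])
        ]
    return h  # the final hidden state is the sequence's representation


# Hypothetical toy vocabulary mapping each word to a unique id.
vocab = {"hello": 0, "how": 1, "are": 2, "you": 3}
params = make_encoder(vocab_size=len(vocab))

short_vec = encode(params, [vocab["hello"]])
long_vec = encode(params, [vocab[w] for w in ["how", "are", "you"]])
print(len(short_vec), len(long_vec))  # 16 16
```

Both inputs end up as 16-dimensional vectors even though one has a single word and the other has three. In a real chatbot the weights would be learned jointly with the decoder, and an LSTM or GRU cell would usually replace the plain tanh update.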

There is a tutorial in the TensorFlow documentation that addresses this sequence-to-sequence architecture, with example code: https://www.tensorflow.org/versions/r0.11/tutorials/index.html

Before working with RNNs, however, I would recommend going through the basics of neural networks: http://deeplearning.net/software/theano/tutorial/#basics

The Deep Learning book by Goodfellow, Bengio, and Courville (http://www.deeplearningbook.org/) covers a lot of material on RNNs, although it involves quite a bit of math.

You should pad your data to a fixed length before passing it to the sequence model.

from keras.preprocessing.sequence import pad_sequences
# Pad or truncate every sequence in X to exactly 100 time steps, both at the start
X = pad_sequences(X, maxlen=100, dtype='float32', padding='pre', truncating='pre', value=0.0)

If your data has the shape [m (number of samples), seq_len (variable sequence length), f (number of features)], this gives every sample a fixed seq_len of 100. Longer sequences are truncated and shorter ones are padded with zeros, in both cases at the start. There are other, more sophisticated techniques for standardizing sequence length, but this is an easy one.
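To see what padding='pre' and truncating='pre' do to the data, here is a small pure-Python sketch of the same behaviour on lists of word ids. The helper name pad_pre is made up for this example and is not part of Keras; it assumes maxlen is at least 1.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and drop leading items from long ones."""
    padded = []
    for seq in sequences:
        seq = list(seq)[-maxlen:]  # truncating='pre': keep the last maxlen items
        padded.append([value] * (maxlen - len(seq)) + seq)  # padding='pre'
    return padded


batch = [[5, 2], [7, 1, 4, 9, 3, 8]]
print(pad_pre(batch, maxlen=4))
# [[0, 0, 5, 2], [4, 9, 3, 8]]
```

The short sequence gets zeros in front and the long one loses its first two ids, so every row has the same length. If 0 is used as the padding value, it should not also be the id of a real word in the vocabulary.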
