
Keras bidirectional LSTM seq2seq

I am trying to adapt the lstm_seq2seq.py example from Keras into a bidirectional LSTM model.

https://github.com/keras-team/keras/blob/master/examples/lstm_seq2seq.py

I tried different approaches:

  • the first one was to directly apply the Bidirectional wrapper to the LSTM layer:

    encoder_inputs = Input(shape=(None, num_encoder_tokens))
    encoder = Bidirectional(LSTM(latent_dim, return_state=True))
    encoder_outputs, state_h, state_c = encoder(encoder_inputs)

but I got this error message:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-76-a80f8554ab09> in <module>()
     75 encoder = Bidirectional(LSTM(latent_dim, return_state=True))
     76 
---> 77 encoder_outputs, state_h, state_c = encoder(encoder_inputs)
     78 # We discard `encoder_outputs` and only keep the states.
     79 encoder_states = [state_h, state_c]

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/engine/topology.py in __call__(self, inputs, **kwargs)
    601 
    602             # Actually call the layer, collecting output(s), mask(s), and shape(s).
--> 603             output = self.call(inputs, **kwargs)
    604             output_mask = self.compute_mask(inputs, previous_mask)
    605 

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/layers/wrappers.py in call(self, inputs, training, mask)
    293             y_rev = K.reverse(y_rev, 1)
    294         if self.merge_mode == 'concat':
--> 295             output = K.concatenate([y, y_rev])
    296         elif self.merge_mode == 'sum':
    297             output = y + y_rev

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in concatenate(tensors, axis)
   1757     """
   1758     if axis < 0:
-> 1759         rank = ndim(tensors[0])
   1760         if rank:
   1761             axis %= rank

/home/tristanbf/.virtualenvs/pydev3/lib/python3.5/site-packages/keras/backend/tensorflow_backend.py in ndim(x)
    597     ```
    598     """
--> 599     dims = x.get_shape()._dims
    600     if dims is not None:
    601         return len(dims)

AttributeError: 'list' object has no attribute 'get_shape'
  • my second guess was to modify the input preparation to something like in https://github.com/keras-team/keras/blob/master/examples/imdb_bidirectional_lstm.py:

    encoder_input_data = np.empty(len(input_texts), dtype=object)
    decoder_input_data = np.empty(len(input_texts), dtype=object)
    decoder_target_data = np.empty(len(input_texts), dtype=object)
    for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
        encoder_input_data[i] = [input_token_index[char] for char in input_text]
        tseq = [target_token_index[char] for char in target_text]
        decoder_input_data[i] = tseq
        decoder_target_data[i] = tseq[1:]
    encoder_input_data = sequence.pad_sequences(encoder_input_data, maxlen=max_encoder_seq_length)
    decoder_input_data = sequence.pad_sequences(decoder_input_data, maxlen=max_decoder_seq_length)
    decoder_target_data = sequence.pad_sequences(decoder_target_data, maxlen=max_decoder_seq_length)

but I got the same error message as above.


Any help? Thanks.

(The code: https://gist.github.com/anonymous/c0fd6541ab4fc9c2c1e0b86175fb65c7 )

The error you're seeing is because the Bidirectional wrapper does not handle the state tensors properly. I've fixed it in this PR, and the fix is already in the 2.1.3 release, so the lines in the question should work once you upgrade Keras to the latest version.

Note that the value returned by Bidirectional(LSTM(..., return_state=True)) is a list containing:

  1. Layer output
  2. States (h, c) of the forward layer
  3. States (h, c) of the backward layer

So you may need to merge the state tensors before passing them to the decoder (which is usually unidirectional, I suppose). For example, if you choose to concatenate the states:

from keras.layers import Input, LSTM, Bidirectional, Concatenate

encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder = Bidirectional(LSTM(latent_dim, return_state=True))
encoder_outputs, forward_h, forward_c, backward_h, backward_c = encoder(encoder_inputs)

# Concatenate the forward and backward states so the decoder receives
# a single (h, c) pair of size 2 * latent_dim.
state_h = Concatenate()([forward_h, backward_h])
state_c = Concatenate()([forward_c, backward_c])
encoder_states = [state_h, state_c]

decoder_inputs = Input(shape=(None, num_decoder_tokens))
# The decoder width must match the concatenated states, hence latent_dim * 2.
decoder_lstm = LSTM(latent_dim * 2, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
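To train this end to end, the decoder outputs still need a softmax projection over the target tokens, as in the original example. A minimal sketch, assuming num_decoder_tokens and the data arrays are prepared as in lstm_seq2seq.py:

from keras.models import Model
from keras.layers import Dense

# Project each decoder timestep onto a distribution over target tokens.
decoder_dense = Dense(num_decoder_tokens, activation='softmax')
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
model.fit([encoder_input_data, decoder_input_data], decoder_target_data,
          batch_size=64, epochs=100, validation_split=0.2)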

If the issue is related to the data preparation process, it's conceptually similar to this one, where a plain Python list doesn't have the shape attribute that NumPy arrays provide.
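For example, a quick way to see the difference (the values here are illustrative, not from the question):

import numpy as np

seq = [1, 2, 3]           # a plain Python list has no shape attribute
arr = np.asarray(seq)     # converting to a NumPy array adds one
print(arr.shape)          # (3,)
# seq.shape would raise an AttributeError, much like the traceback above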

Also, you should feed your input to the LSTM encoder, or simply set the input_shape value on the LSTM layer. Always set return_sequences=True on an LSTM layer when feeding it into another Bidirectional one. Check this thread to understand how to use them correctly, and then check out some lines of code I wrote for an NLP project (I used the Bidirectional layer too) on my GitHub.
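A minimal sketch of that advice (the layer sizes are illustrative assumptions, not values from the question):

from keras.models import Sequential
from keras.layers import LSTM, Bidirectional, Dense

model = Sequential()
# input_shape is set on the first layer; return_sequences=True makes the
# inner LSTM emit the full sequence, which the stacked Bidirectional needs.
model.add(Bidirectional(LSTM(64, return_sequences=True),
                        input_shape=(max_encoder_seq_length, num_encoder_tokens)))
model.add(Bidirectional(LSTM(64)))
model.add(Dense(num_decoder_tokens, activation='softmax'))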
