How to implement a One to Many RNN in FluxML (Julia Lang)?

There is a wide set of examples showing how to create various RNN architectures in Python with TensorFlow and PyTorch, including the 1-to-many architecture. The question is how this can be done in FluxML with Julia. In Keras (TensorFlow) the return_sequences option on an RNN cell allows the states to be propagated, but judging from the FluxML documentation https://fluxml.ai/Flux.jl/stable/models/recurrence/ , an equivalent option does not seem to be implemented.

  • How should such an architecture be implemented in Flux ML?
  • When stacking RNN units in a Chain such as Chain(rnn1, rnn2, rnn3), does Chain pass the output vectors (y) into the inputs of the following RNN unit, or the hidden state (or both)? (See the sketch after this list.)
  • How can the RNN layer or Flux.RNNCell be used in different contexts, and within the context of a Chain or model?
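
To ground the second question: with the stateful recurrence API that the code below relies on (Flux up to v0.14, where recurrent layers are Recur wrappers and Flux.reset! clears their hidden state), a Chain of recurrent layers passes only each layer's output vector on to the next layer; every layer keeps its own hidden state internally, and no state is forwarded between layers. A minimal sketch, with illustrative names and dimensions:

using Flux

# Each recurrent layer in the Chain is a stateful wrapper that keeps its own
# hidden state. The Chain only forwards the output vector of one layer as
# the input of the next; hidden states are never passed between layers.
stacked = Chain(RNN(4 => 8), RNN(8 => 3))

x = rand(Float32, 4)        # a single time step with 4 features

Flux.reset!(stacked)        # zero the hidden state of every layer
y1 = stacked(x)             # step 1: layer 1's output feeds layer 2
y2 = stacked(x)             # step 2: each layer reuses its own updated state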

Assuming that the goal is to have a single input presented at step 1 produce 2 y_hat outputs, the 1-to-many case here is a 1-to-2 recurrent network.

(Figure: one-to-many architecture)

So the model's input X must have the same dimension as the y_hat outputs, so that the output of each step can become the new x input at the following step (the hidden component is not altered directly). A possible model is (notice that the input and output dimensions are equal):

rnn_model = Chain(LSTM(feature_length => 12), Dense(12 => feature_length, sigmoid), softmax)
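
A minimal sketch of building this model and checking the forward-pass shape (feature_length and batch_size are illustrative values; the sigmoid/softmax head is taken from the model above and assumes probability-like targets):

using Flux

feature_length = 5                      # illustrative feature dimension
batch_size     = 16                     # illustrative number of independent samples

# Input and output widths match, so a prediction can be fed back in as the next x.
rnn_model = Chain(LSTM(feature_length => 12),
                  Dense(12 => feature_length, sigmoid),
                  softmax)

x_batch = rand(Float32, feature_length, batch_size)

Flux.reset!(rnn_model)                  # clear the LSTM state before a new sequence
y_hat1 = rnn_model(x_batch)             # size (feature_length, batch_size): the right
                                        # shape to serve as the step-2 input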

In the training loop, the gradient and loss can be obtained by aggregating the loss from each step at which the unit tries to predict that step's outcome. The key point for 1-to-many is that y_hat has to be fed back as the x input of the following step, so that y_hat1 becomes x_2. Here is a small example where x_batch holds the first-step input data from a set of independent samples, and y_tensor holds the target data for 2 steps, indexed by the 3rd dimension:

Flux.reset!(rnn_model)           # clear the hidden state before starting the sequence
loss_tmp, grads = Flux.withgradient(rnn_model) do model
    loss = 0f0

    # Step 1: the real input produces the first prediction.
    y_hat1 = model(x_batch)
    loss += Flux.crossentropy(y_hat1, y_tensor[:, :, 1])

    # Step 2: feed the first prediction back in as the next input (y_hat1 becomes x_2).
    y_hat2 = model(y_hat1)
    loss += Flux.crossentropy(y_hat2, y_tensor[:, :, 2])

    return loss
end
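
To complete a training step, the gradient returned above can be applied with one of Flux's optimisers through the explicit-style API (the Adam optimiser and the 1f-3 learning rate are illustrative choices; depending on the Flux version the name may be ADAM instead of Adam):

opt_state = Flux.setup(Flux.Adam(1f-3), rnn_model)   # create the optimiser state once, before the loop
Flux.update!(opt_state, rnn_model, grads[1])         # grads[1] is the gradient with respect to rnn_model

Repeating Flux.reset!, the withgradient block, and Flux.update! over the batches then gives the full 1-to-2 training loop; a longer output sequence just adds more "y_hat_k = model(y_hat_{k-1})" steps and loss terms inside the block.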
