Keras/Tensorflow Input to RNN layers

I'm trying to build an RNN in Keras. I don't quite understand the required input format. I can build dense networks with no problem, but I think the RNN layers expect input shaped as dimension x batch x time step? Can anyone verify this?

Here is the code I would like to update:

Original code:

from keras.layers import Input, Dense, Activation
from keras.models import Model
from keras.optimizers import SGD

def get_generative(G_in, dense_dim=200, out_dim=50, lr=1e-3):
    x = Dense(dense_dim)(G_in)
    x = Activation('tanh')(x)
    G_out = Dense(out_dim, activation='tanh')(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out

G_in = Input(shape=[10])
G, G_out = get_generative(G_in)
G.summary()

Modified with GRU layers and some slightly different dimensions:

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
   clear_session()
   x = GRU(dense_dim, activation='tanh',return_state=True)(G_in)
   G_out = GRU(out_dim, return_state=True)(x)
   G = Model(G_in, G_out)
   opt = SGD(lr=lr)
   G.compile(loss='binary_crossentropy', optimizer=opt)
   return G, G_out

G_in = Input(shape=(None,3))
G, G_out = get_generative(G_in)
G.summary()

The error that I am seeing with this code is: ValueError: Tensor("gru_1/strided_slice:0", shape=(3, 10), dtype=float32) must be from the same graph as Tensor("strided_slice_1:0", shape=(?, 3), dtype=float32).

If I remove the "None" above, I get: ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=2

Any explanation would be helpful here.

You get this error because you clear the session after creating the input tensor, so the input tensor no longer comes from the same graph as the rest of your network. To fix this, simply leave out the clear_session() line.
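If you really do need to reset the graph between runs, one option (a sketch, not part of the fix above) is to call clear_session() before you create any tensors:

from keras.backend import clear_session
from keras.layers import Input

clear_session()                  # reset the graph first...
G_in = Input(shape=(None, 3))    # ...then create every tensor in the fresh graph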

Another problem with your code: the second GRU layer expects a sequence as input, so you should set return_sequences=True on the first GRU layer. You probably also want to drop return_state=True, since that makes the layer return a list of tensors (the output and the final state) instead of a single output tensor.
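To see what these flags do to the output shapes, here is a quick sketch (the dimensions are arbitrary):

from keras.layers import Input, GRU

seq = Input(shape=(None, 3))                  # (batch, time steps, features)
last = GRU(10)(seq)                           # (batch, 10): last output only
full = GRU(10, return_sequences=True)(seq)    # (batch, None, 10): one output per time step
out, state = GRU(10, return_state=True)(seq)  # two tensors: last output and final state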

To sum up, the following code should do it:

from keras.layers import Input, GRU
from keras.models import Model
from keras.optimizers import SGD

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    # return_sequences=True gives the second GRU the 3D sequence it expects
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out
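Building the model then works with the same variable-length input as in the question:

G_in = Input(shape=(None, 3))
G, G_out = get_generative(G_in)
G.summary()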

The problem here is that RNN layers expect a 3D tensor input of the form: [num samples, time steps, features].

So we can modify the code above as:

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
    # same fix as above: return_sequences=True instead of return_state=True
    x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
    G_out = GRU(out_dim)(x)
    G = Model(G_in, G_out)
    opt = SGD(lr=lr)
    G.compile(loss='binary_crossentropy', optimizer=opt)
    return G, G_out

G_in = Input(shape=(1, 3))
G, G_out = get_generative(G_in)
G.summary()

So what we are saying is that we expect an input of an arbitrary number of samples, each with 1 time step and 3 features.
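As a quick sanity check (using a hypothetical batch of 5 random samples), data of that shape passes straight through the model:

import numpy as np

dummy = np.random.random((5, 1, 3))   # 5 samples, 1 time step, 3 features
preds = G.predict(dummy)
print(preds.shape)                    # (5, 37), since out_dim=37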

Anna is correct that clear_session() should not be inside the generator function.

Lastly, if you actually want to feed data into the network, its shape must also match what we just discussed. You can do this with numpy's reshape:

# turn a 2D array (samples, features) into 3D (samples, 1 time step, features)
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
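For instance, with a hypothetical 2D array of 100 samples and 3 features:

import numpy as np

X = np.random.random((100, 3))
print(X.shape)                                  # (100, 3)
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
print(X.shape)                                  # (100, 1, 3), matching Input(shape=(1, 3))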
