
Keras/Tensorflow Input to RNN layers

I'm trying to build an RNN in Keras. I don't quite understand the required input format. I can build dense networks with no problem, but I think that the RNN layers expect input of dimension x batch x time step? Can anyone verify this?

Here is the code I would like to update:

Original code:

from keras.layers import Input, Dense, Activation
from keras.models import Model
from keras.optimizers import SGD

def get_generative(G_in, dense_dim=200, out_dim=50, lr=1e-3):
   x = Dense(dense_dim)(G_in)
   x = Activation('tanh')(x)
   G_out = Dense(out_dim, activation='tanh')(x)
   G = Model(G_in, G_out)
   opt = SGD(lr=lr)
   G.compile(loss='binary_crossentropy', optimizer=opt)
   return G, G_out

G_in = Input(shape=[10])
G, G_out = get_generative(G_in)
G.summary()

Modified with GRU layers and some slightly different dimensions:

from keras.backend import clear_session
from keras.layers import GRU

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
   clear_session()
   x = GRU(dense_dim, activation='tanh', return_state=True)(G_in)
   G_out = GRU(out_dim, return_state=True)(x)
   G = Model(G_in, G_out)
   opt = SGD(lr=lr)
   G.compile(loss='binary_crossentropy', optimizer=opt)
   return G, G_out

G_in = Input(shape=(None,3))
G, G_out = get_generative(G_in)
G.summary()

The error that I am seeing with this code is: ValueError: Tensor("gru_1/strided_slice:0", shape=(3, 10), dtype=float32) must be from the same graph as Tensor("strided_slice_1:0", shape=(?, 3), dtype=float32).

If I remove the "None" above, I get: ValueError: Input 0 is incompatible with layer gru_1: expected ndim=3, found ndim=2

Any explanation would be helpful here.

You get an error because you clear the session after creating the input tensor. That is why the input tensor does not come from the same graph as the rest of your network. To fix this, simply leave out the clear_session() line.
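
If you do want a fresh graph for each model, a safe pattern (a minimal sketch, assuming standalone Keras imports) is to call clear_session() before creating any tensors, so that everything, including the Input, lives in the new graph:

from keras.backend import clear_session
from keras.layers import Input

clear_session()                  # reset the graph first, while nothing depends on it
G_in = Input(shape=(None, 3))    # now created inside the fresh graph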

Another problem with your code: the second GRU layer expects a sequence as input, so you should use return_sequences=True in the first GRU layer. You probably also want to leave out the argument return_state=True, since that makes the layer return a tuple of tensors (output and state) instead of just one output tensor.
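
To see what these two flags do, here is a minimal sketch (the layer sizes and input shape are made up for illustration):

from keras.layers import Input, GRU

inp = Input(shape=(5, 3))                       # 5 time steps, 3 features per step
last = GRU(10)(inp)                             # shape (None, 10): last output only
seq = GRU(10, return_sequences=True)(inp)       # shape (None, 5, 10): one output per time step
out, state = GRU(10, return_state=True)(inp)    # two tensors: last output plus final hidden state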

To sum up, the following code should do it:

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
   x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)
   G_out = GRU(out_dim)(x)
   G = Model(G_in, G_out)
   opt = SGD(lr=lr)
   G.compile(loss='binary_crossentropy', optimizer=opt)
   return G, G_out
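
As a quick sanity check (the dummy data below is only for illustration), you can build the model and push a random batch through it:

import numpy as np
from keras.layers import Input

G_in = Input(shape=(None, 3))     # variable-length sequences with 3 features per step
G, G_out = get_generative(G_in)
G.summary()

X = np.random.rand(4, 7, 3)       # 4 samples, 7 time steps, 3 features
print(G.predict(X).shape)         # (4, 37): one 37-dimensional output per sample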

The problem here is that RNN layers expect a 3D tensor input of the form: [num samples, time steps, features].

So we can modify the code above as:

def get_generative(G_in, dense_dim=10, out_dim=37, lr=1e-3):
   x = GRU(dense_dim, activation='tanh', return_sequences=True)(G_in)  # pass the full sequence on
   G_out = GRU(out_dim)(x)                                             # single output tensor, no state tuple
   G = Model(G_in, G_out)
   opt = SGD(lr=lr)
   G.compile(loss='binary_crossentropy', optimizer=opt)
   return G, G_out

G_in = Input(shape=(1,3))
G, G_out = get_generative(G_in)
G.summary()

So what we are saying is that we expect an input of an arbitrary number of samples, each of 1 time step with 3 features.

Anna is correct that clear_session() should not be inside the generator function.

Lastly, if you actually want to input data into the network, its shape should also match what we just discussed. You can do this by using numpy's reshape:

X = np.reshape(X, (X.shape[0], 1, X.shape[1]))
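
For example (the array sizes here are illustrative), a 2D array of samples x features becomes a 3D array of samples x 1 time step x features, which the model above will accept:

import numpy as np

X = np.random.rand(100, 3)                       # 100 samples, 3 features each
X = np.reshape(X, (X.shape[0], 1, X.shape[1]))   # now (100, 1, 3): samples, time steps, features
print(G.predict(X).shape)                        # (100, 37): one output vector per sample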
