
Small LSTM model in keras does not fit my GPU

I am programming a relatively small LSTM model in Google Colab.

For reference, I am using TensorFlow 1.13 to build the model, with tensorflow.keras as the Keras API.

from tensorflow.keras import Model
from tensorflow.keras import layers as ll

seq_len = 20000
n_classes = 4

inputs = ll.Input(shape=(seq_len,))
x = ll.Embedding(len(word_index), 1000)(inputs)  # word_index: vocabulary mapping defined elsewhere in the notebook
x = ll.LSTM(units=100, activation='relu', return_sequences=True)(x)
outputs = ll.Dense(units=n_classes, activation='softmax')(x)
model = Model(inputs, outputs)
model.summary()

I have checked that I have 15 GB of GPU RAM available, and according to my estimates the model with a batch size of 32 should fit in 3 GB of RAM.
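As a rough sanity check of that estimate (my own back-of-the-envelope arithmetic, not from the original post), the activation tensors of the two largest layers at batch size 32 come out to roughly:

# Rough activation-memory estimate in float32 (4 bytes per value); a sketch that
# ignores gradients, optimizer state and TensorFlow's own overhead.
batch, seq_len, emb_dim, lstm_units = 32, 20000, 1000, 100

embedding_out = batch * seq_len * emb_dim * 4    # 2,560,000,000 bytes ~ 2.56 GB
lstm_out      = batch * seq_len * lstm_units * 4 # 256,000,000 bytes ~ 0.26 GB

print(embedding_out / 1e9, lstm_out / 1e9)       # ~2.56 GB and ~0.26 GB

So on paper the forward activations are indeed in the ~3 GB ballpark, which makes the out-of-memory behaviour surprising.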

However, whenever I launch the training, the server runs out of memory.

To be fair, I am using extremely long sequences of data (20000 is the maximum sequence length), but I would expect the model to unroll symbolically in memory and just fit.

Reducing the batch size to 1 does not help either.

What is going on? How can I make this model fit in memory?

EDIT: I tried reducing the sequence length to 2, and that indeed makes it fit in memory. But I need the sequence length to remain high. How can I tell TensorFlow not to unroll the network at any point? (I suspect that is what is going on behind the scenes; how can I check whether this is indeed the case?)
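One thing worth checking (my suggestion, not part of the original question) is the unroll argument of the Keras LSTM layer. It defaults to False, which runs the recurrence as a symbolic loop; True unrolls the graph over all time steps and is only intended for short sequences:

# Explicitly keeping the default unroll=False; with a 20000-step sequence,
# unroll=True would try to build 20000 copies of the cell in the graph.
x = ll.LSTM(units=100, activation='relu', return_sequences=True, unroll=False)(x)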

EDIT: If I remove the Softmax layer then the memory use drops back to the normal range. I think the Softmax layer is causing TensorFlow to unroll the network. Wrapping the Softmax in TimeDistributed does not help, though.
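For reference, the TimeDistributed variant mentioned above would look something like this (a sketch of the attempt, reusing the layer names from the question):

# Apply the Dense + softmax independently at every time step; functionally this is
# equivalent to a plain Dense applied to the 3-D LSTM output, which is consistent
# with it not changing the memory behaviour here.
outputs = ll.TimeDistributed(ll.Dense(units=n_classes, activation='softmax'))(x)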

Changing the LSTM layer to the CuDNNLSTM layer did the trick!

inputs = ll.Input(shape=(seq_len,))
x = ll.Embedding(len(word_index), 1024)(inputs)
# CuDNNLSTM runs on a fused cuDNN kernel and uses a fixed tanh activation,
# so the activation argument from the original LSTM layer is dropped.
x = ll.CuDNNLSTM(units=100, return_sequences=True)(x)
x = ll.Dense(units=n_classes, activation='softmax')(x)
outputs = x
model = Model(inputs, outputs)
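A minimal way to train the resulting model might look like the following (my own sketch; the optimizer, loss and the X_train / y_train names are assumptions, not part of the answer):

# Hypothetical training call. Because return_sequences=True, the model predicts a class
# at every time step, so y_train is assumed to be one-hot encoded with shape
# (num_samples, seq_len, n_classes) and X_train to hold word indices of shape (num_samples, seq_len).
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, batch_size=32, epochs=5)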
