
GridSearchCV training on a keras LSTM model is “killed” without a clear reason

I am stumped by a very strange issue. I am trying to train a simple LSTM model using a sklearn classifier and GridSearchCV. When the grid search runs with multiple jobs, the code hangs without any output; with a single job, the process is killed after printing the output below:

2018-02-17 18:15:02.733824: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX
234/234 [==============================] - 0s 1ms/step
935/935 [==============================] - 1s 626us/step
234/234 [==============================] - 0s 2ms/step
935/935 [==============================] - 1s 684us/step
234/234 [==============================] - 1s 2ms/step
935/935 [==============================] - 1s 684us/step
234/234 [==============================] - 1s 2ms/step
935/935 [==============================] - 1s 547us/step
...
...
234/234 [==============================] - 4s 16ms/step
935/935 [==============================] - 1s 1000us/step
Killed

Does anyone know what is killing the GridSearchCV run?

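Your Python process is being killed by the Linux kernel's OOM (out-of-memory) killer: the system has run out of memory while Python keeps requesting more. The steadily increasing step times in your log, ending in a bare "Killed", are consistent with memory growing on every fit.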
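One way to check whether memory really grows from fit to fit is to log the process's peak resident set size at a few points in the script. The following is only a minimal sketch using the standard library; the helper names `peak_rss_mb` and `log_memory` are made up for illustration, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource


def peak_rss_mb():
    """Return the peak resident set size of this process in MiB.

    Assumption: running on Linux, where ru_maxrss is given in KiB.
    """
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_maxrss / 1024.0


def log_memory(label):
    """Print the peak memory so far, tagged with a label, and return it."""
    mb = peak_rss_mb()
    print(f"[{label}] peak RSS: {mb:.1f} MiB")
    return mb
```

Calling `log_memory("after fit")` after each model fit should show whether the peak keeps climbing until the process is killed.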

Since you are doing cross-validation, and assuming you are using TensorFlow as the backend, it is possible that this is a bug in Keras/TF in which the session isn't cleared between fits, so every new model adds to the memory already in use. More information: https://github.com/keras-team/keras/issues/2102

A quick solution would be to call keras.backend.clear_session after each CV iteration. If you are not using the TF backend, then it's probably a bug in your own code.
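As far as I know, GridSearchCV does not offer a hook that runs after each fold, so one way to apply this is to write the cross-validation loop by hand. The sketch below shows only the shape of such a loop: `cross_validate`, `fit_and_score` and `cleanup` are hypothetical names, not part of Keras or sklearn, and the cleanup function is passed in so the loop itself has no dependency on a particular backend.

```python
import gc


def cross_validate(folds, fit_and_score, cleanup):
    """Score every fold, releasing backend state after each one.

    folds         -- iterable of (train_indices, test_indices) pairs
    fit_and_score -- hypothetical callable that builds a fresh model,
                     fits it on the train indices and returns a score
    cleanup       -- callable run after every fold; with the TF backend
                     this would be keras.backend.clear_session
    """
    scores = []
    for train_idx, test_idx in folds:
        try:
            scores.append(fit_and_score(train_idx, test_idx))
        finally:
            cleanup()     # runs even if the fit raised an exception
            gc.collect()  # collect Python objects left over from the old model
    return scores
```

With the TF backend you would pass `keras.backend.clear_session` as `cleanup` and build the model inside `fit_and_score`, so that no reference to the previous model survives into the next fold.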
