How to fix the "Segmentation fault (core dumped)" error in Keras

Question

I'm having issues with Keras. When I try to fit a model with a Conv2D layer, it fails with "Segmentation fault (core dumped)".

My code works on the CPU. It also works without any Conv2D layers (even though that's ineffective for my use case). I have CUDA, cuDNN, and TensorFlow installed, and I've tried reinstalling Keras and TensorFlow.

Code:

import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, Dense, Flatten

# env_size(), env() and swisher are defined elsewhere (not shown here).
def model_build():
    model = Sequential()
    model.add(Conv2D(input_shape = (env_size()[0], env_size()[1], 1), filters=4, kernel_size=(3,3), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Conv2D(filters=4, kernel_size=(5,5), strides=1, activation=swisher))
    model.add(Flatten())
    model.add(Dense(128, activation='softmax'))
    model.add(Dense(4, activation='softmax'))
    return model

if __name__ == '__main__':
    y = model_build()
    y.compile(loss = "mean_squared_error", optimizer = 'adam')
    y.fit(x=env(), y = np.array([[0,0,0,0]]))

Error:

Using TensorFlow backend.
Epoch 1/1
2019-03-27 05:52:27.687323: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-03-27 05:52:27.789975: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:964] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-03-27 05:52:27.790819: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1411] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.83
pciBusID: 0000:01:00.0
totalMemory: 5.73GiB freeMemory: 5.40GiB
2019-03-27 05:52:27.790834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1490] Adding visible gpu devices: 0
2019-03-27 05:52:28.068080: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-03-27 05:52:28.068115: I tensorflow/core/common_runtime/gpu/gpu_device.cc:977]      0
2019-03-27 05:52:28.068121: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990] 0:   N
2019-03-27 05:52:28.068487: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1103] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 5147 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2060, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-03-27 05:52:28.177752: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.337277: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.500486: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.586280: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
2019-03-27 05:52:28.675738: W tensorflow/core/framework/allocator.cc:113] Allocation of 518619136 exceeds 10% of system memory.
Segmentation fault (core dumped)

EDIT:

Self-contained example.

import numpy as np
import keras

model = keras.models.Sequential() #Sequential model type.
model.add(keras.layers.Conv2D(filters=1, kernel_size=(3,3), strides = 1, activation="sigmoid")) #Convolutional layer.
model.add(keras.layers.Flatten()) #Flatten layer.
model.add(keras.layers.Dense(4)) #Dense layer of 4 units.
model.compile(loss='mean_squared_error', optimizer='adam') #compile model.
y = np.random.rand(1,4) #Random expected output
x = np.random.rand(1, 38, 21, 1) # Random input.
model.fit(x, y) #And fit...

EDIT2:

Keras version: 'v2.1.6-tf'

Tensorflow-GPU version: 'v1.12'

Python version: 'v3.5.2'

CUDA version: 'v9.0.176'

CUDNN version: 'v7.2.1.38-1+cuda9.0'

Ubuntu version: 'v16.04'

Answer 1

It seems that your GPU does not have enough memory. Your model does not seem to be too big, so I would guess that the problem comes from this line:

y.fit(x=env(), y = np.array([[0,0,0,0]]))

The output of env() might be too large for your GPU memory to handle.
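A rough way to test this guess is to estimate how many bytes the input and the first Dense layer need for a given input size. The sketch below is only an estimate under assumptions: the helper names are made up for illustration, the convolutions are taken to use Keras's default 'valid' padding with stride 1, and the 490 x 546 input is hypothetical (the real env_size() is not shown in the question). That size was picked only because it happens to reproduce the 518619136 bytes reported in the allocator warnings; other shapes could give the same number.

```python
BYTES_PER_FLOAT32 = 4

def conv_output_hw(height, width, kernel_sizes):
    """Spatial size after a stack of 'valid', stride-1 convolutions."""
    for k in kernel_sizes:
        height -= k - 1
        width -= k - 1
    return height, width

def dense_kernel_bytes(height, width, filters, units):
    """Bytes used by the float32 weight matrix of the Dense layer after Flatten."""
    flat_features = height * width * filters
    return flat_features * units * BYTES_PER_FLOAT32

# Hypothetical input size; replace with the real env_size() values.
in_h, in_w = 490, 546

# Kernel sizes of the four Conv2D layers in the question: 3, 5, 5, 5.
h, w = conv_output_hw(in_h, in_w, kernel_sizes=[3, 5, 5, 5])

# One float32 sample of shape (in_h, in_w, 1).
input_bytes = in_h * in_w * 1 * BYTES_PER_FLOAT32

# Weights of Dense(128) fed by Flatten over 4 feature maps.
kernel_bytes = dense_kernel_bytes(h, w, filters=4, units=128)

print("conv output:", h, "x", w)
print("input bytes:", input_bytes)
print("Dense(128) kernel bytes:", kernel_bytes)
```

Comparing both numbers with the freeMemory value in the log gives a first idea of which tensor, if any, is large enough to matter.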

Answer 2

Your MWE works fine for me (if I add input_shape=(38, 21, 1) to the first convolution layer):

import numpy as np
import keras

model = keras.models.Sequential() #Sequential model type.
model.add(keras.layers.Conv2D(filters=1, kernel_size=(3,3), strides = 1, activation="sigmoid", input_shape=(38, 21, 1))) #Convolutional layer.
model.add(keras.layers.Flatten()) #Flatten layer.
model.add(keras.layers.Dense(4)) #Dense layer of 4 units.
model.compile(loss='mean_squared_error', optimizer='adam') #compile model.
y = np.random.rand(2, 4) #Random expected output
x = np.random.rand(2, 38, 21, 1) # Random input.
model.fit(x, y)

That means your issue must come from your system or installation.

The TensorFlow compatibility chart shows that your Python, TensorFlow, and CUDA versions should be compatible.

For your configuration, cuDNN 7.0.x is recommended. The cuDNN 7.2 that you are using is probably incompatible. Try installing cuDNN 7.0.x.
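To confirm which cuDNN version is actually installed, one option is to read the version macros from the cudnn.h header. This is a minimal sketch, not an official tool: the function names are made up for illustration, and the path /usr/include/cudnn.h is an assumption that depends on how cuDNN was installed (it may live under the CUDA directory instead). It relies on the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL defines that cuDNN 7 headers contain.

```python
import re

def parse_cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cudnn.h text, or None if not found."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(
            r"^\s*#define\s+" + name + r"\s+(\d+)", header_text, re.MULTILINE
        )
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)

def read_cudnn_version(path="/usr/include/cudnn.h"):
    """Read the header at `path` (assumed location) and parse its version."""
    try:
        with open(path) as header:
            return parse_cudnn_version(header.read())
    except OSError:
        # Header not found or not readable at this location.
        return None

print("cuDNN version:", read_cudnn_version())
```

If this prints None, the header is somewhere else on your system; pass its path to read_cudnn_version.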

2 answers

Answer 1
0 2019-03-27 11:06:46

Answer 2
0 Accepted 2019-03-30 17:44:28