InternalError (see above for traceback): Blas GEMM launch failed

I trained a model with Keras and want to evaluate it, but I always get the error below. I found a solution in "TensorFlow: InternalError: Blas SGEMM launch failed", but it only covers plain TensorFlow, not Keras.

Using TensorFlow backend.
2017-11-01 10:40:49.120525: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-01 10:40:49.120546: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-01 10:40:49.120553: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-01 10:40:49.120557: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-01 10:40:49.120562: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-11-01 10:40:49.266103: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-11-01 10:40:49.266511: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties: 
name: GeForce 940MX
major: 5 minor: 0 memoryClockRate (GHz) 1.189
pciBusID 0000:01:00.0
Total memory: 1.96GiB
Free memory: 1.78GiB
2017-11-01 10:40:49.266528: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 
2017-11-01 10:40:49.266534: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0:   Y 
2017-11-01 10:40:49.266542: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0)
x_train shape: (50000, 32, 32, 3)
50000 train samples
10000 test samples
2017-11-01 10:40:54.162805: E tensorflow/stream_executor/cuda/cuda_blas.cc:366] failed to create cublas handle: CUBLAS_STATUS_NOT_INITIALIZED
2017-11-01 10:40:54.162825: W tensorflow/stream_executor/stream.cc:1756] attempting to perform BLAS operation using StreamExecutor without BLAS support
Traceback (most recent call last):
  File "/home/viktor/PycharmProjects/ProjectSSD/test.py", line 39, in <module>
    scores = model.evaluate(x_test_bin, y_test, verbose=1)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/models.py", line 896, in evaluate
    sample_weight=sample_weight)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1657, in evaluate
    steps=steps)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/engine/training.py", line 1339, in _test_loop
    batch_outs = f(ins_batch)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 2273, in __call__
    **self.session_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 895, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1124, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1321, in _do_run
    options, run_metadata)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.py", line 1340, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: Blas GEMM launch failed : a.shape=(32, 2304), b.shape=(2304, 512), m=32, n=512, k=2304
     [[Node: dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](flatten_1/Reshape, dense_1/kernel/read)]]

Caused by op u'dense_1/MatMul', defined at:
  File "/home/viktor/PycharmProjects/ProjectSSD/test.py", line 13, in <module>
    model = load_model(save_dir + '/' + model_name)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/models.py", line 239, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/models.py", line 313, in model_from_config
    return layer_module.deserialize(config, custom_objects=custom_objects)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/layers/__init__.py", line 54, in deserialize
    printable_module_name='layer')
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/utils/generic_utils.py", line 139, in deserialize_keras_object
    list(custom_objects.items())))
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/models.py", line 1214, in from_config
    model.add(layer)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/models.py", line 475, in add
    output_tensor = layer(self.outputs[0])
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 602, in __call__
    output = self.call(inputs, **kwargs)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/layers/core.py", line 841, in call
    output = K.dot(inputs, self.kernel)
  File "/home/viktor/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 998, in dot
    out = tf.matmul(x, y)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/math_ops.py", line 1844, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_math_ops.py", line 1289, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InternalError (see above for traceback): Blas GEMM launch failed : a.shape=(32, 2304), b.shape=(2304, 512), m=32, n=512, k=2304
     [[Node: dense_1/MatMul = MatMul[T=DT_FLOAT, transpose_a=false, transpose_b=false, _device="/job:localhost/replica:0/task:0/gpu:0"](flatten_1/Reshape, dense_1/kernel/read)]]

Here is the code. It loads the model, loads the dataset, prepares the data for evaluation, and then evaluates the model.

from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.models import load_model
import numpy as np

import os

num_classes = 10
save_dir = os.path.join(os.getcwd(), 'examples/saved_models')
model_name = 'keras_cifar10_trained_model.h5'
model = load_model(save_dir + '/' + model_name)

# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

x_train_float = x_train.astype('float32')
x_test_float = x_test.astype('float32')
x_train_bin = x_train_float / 255
x_test_bin = x_test_float / 255

# Score trained model.
scores = model.evaluate(x_test_bin, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
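
The TensorFlow-only fix mentioned at the top of the question is usually applied by creating the session with on-demand GPU memory allocation. A minimal sketch of how that might be carried over to Keras follows, assuming a TensorFlow 1.x backend like the one in the log (tf.ConfigProto and tf.Session are the 1.x API). The helper name configure_gpu_session and its parameters are hypothetical and not part of the original script, and it is not confirmed that this resolves the error here.

```python
from __future__ import print_function


def configure_gpu_session(allow_growth=True, memory_fraction=None):
    """Hypothetical helper: hand Keras a TF 1.x session with custom GPU options.

    Assumes TensorFlow 1.x and Keras running on the TensorFlow backend.
    The imports live inside the function so that defining the helper does
    not itself initialise TensorFlow or touch the GPU.
    """
    import tensorflow as tf
    from keras import backend as K

    config = tf.ConfigProto()
    # Allocate GPU memory on demand instead of reserving almost all of it.
    config.gpu_options.allow_growth = allow_growth
    if memory_fraction is not None:
        # Optionally cap the share of GPU memory this process may use.
        config.gpu_options.per_process_gpu_memory_fraction = memory_fraction

    session = tf.Session(config=config)
    # Make Keras use this session for everything that follows.
    K.set_session(session)
    return session
```

In the script above, the call configure_gpu_session() would go before load_model(...), so that the model is built inside the configured session.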

I found that other people hit the same problem when they run two sessions at the same time. After the line

model = load_model(save_dir + '/' + model_name)

has executed, the GPU usage increases (checked with watch -n 0.5 nvidia-smi). Maybe this is the problem?
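
To check this hypothesis from inside the script rather than by watching a terminal, the memory figures can be read from nvidia-smi's CSV query output. The sketch below is only an illustration: the function names parse_gpu_memory and query_gpu_memory are made up for this example, and it assumes nvidia-smi is on the PATH.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' rows (in MiB) from nvidia-smi CSV output.

    Returns one (used_mib, total_mib) tuple per GPU.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        used, total = [int(field.strip()) for field in line.split(",")]
        rows.append((used, total))
    return rows


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every visible GPU."""
    output = subprocess.check_output([
        "nvidia-smi",
        "--query-gpu=memory.used,memory.total",
        "--format=csv,noheader,nounits",
    ])
    return parse_gpu_memory(output.decode("utf-8"))
```

Calling query_gpu_memory() once before and once after load_model(...) would show how much memory the process claims at that point. If another process (for example a second Python session) already holds most of the card's memory, that would be consistent with the CUBLAS_STATUS_NOT_INITIALIZED line in the log.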

Anthony D'amato, sorry I wasted your time.

The error comes from a part of the code that has something to do with cv2. I opened a new question:

cv2, keras, InternalError (see above for traceback): Blas GEMM launch failed

Thank you very much. You helped me get closer to the solution.
