tensorflow.python.framework.errors_impl.InternalError: GPU sync failed

Question

I have the following installed:

Windows 10
Python 3.8
tensorflow-gpu 2.3
CUDA 10.1
cuDNN 7.6.5
NVIDIA GTX 1080
Driver Version: 451.48
Memory: 8192MiB

During training, it gives the following error:

Traceback (most recent call last):
 File "training.py", line 519, in <module>
   history = model.fit(X_train, y_train, epochs=n_epochs, batch_size=batch_size, \
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\engine\training.py", line 108, in _method_wrapper
   return method(self, *args, **kwargs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\engine\training.py", line 1103, in fit
  callbacks.on_train_batch_end(end_step, logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 440, in on_train_batch_end
  self._call_batch_hook(ModeKeys.TRAIN, 'end', batch, logs=logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 289, in _call_batch_hook
  self._call_batch_end_hook(mode, batch, logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 309, in _call_batch_end_hook
  self._call_batch_hook_helper(hook_name, batch, logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 342, in _call_batch_hook_helper
  hook(batch, logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 961, in on_train_batch_end
   self._batch_update_progbar(batch, logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\callbacks.py", line 1016, in _batch_update_progbar
   logs = tf_utils.to_numpy_or_python_type(logs)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\utils\tf_utils.py", line 537, in to_numpy_or_python_type
  return nest.map_structure(_to_single_numpy_or_python_type, tensors)
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\util\nest.py", line 635, in map_structure
  structure[0], [func(*x) for x in entries],
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\util\nest.py", line 635, in <listcomp>
  structure[0], [func(*x) for x in entries],
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\keras\utils\tf_utils.py", line 533, in _to_single_numpy_or_python_type
   x = t.numpy()
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\framework\ops.py", line 1063, in numpy
  maybe_arr = self._numpy()  # pylint: disable=protected-access
 File "C:\Anaconda3_64\lib\site-packages\tensorflow\python\framework\ops.py", line 1031, in _numpy
  six.raise_from(core._status_to_exception(e.code, e.message), None)  # pylint: disable=protected-access
 File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InternalError: GPU sync failed

InternalError: GPU sync failed

Any leads?

Answer 1

Please verify that the paths for CUDA and CUPTI are set properly, as below, to enable GPU support on your system. Replace v11.0 with the directory of the CUDA version you actually have installed (v10.1 in the question).

SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\bin;%PATH%
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\extras\CUPTI\lib64;%PATH%
SET PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0\include;%PATH%
SET PATH=C:\tools\cuda\bin;%PATH%
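
As a quick sanity check of the settings above, here is a minimal Python sketch that reports, for each directory, whether it is on PATH and whether it exists on disk. The helper name check_cuda_paths and the CUDA_ROOT value are illustrative assumptions, not part of TensorFlow or CUDA; point CUDA_ROOT at your own install.

```python
import os


def check_cuda_paths(expected_dirs, path_value=None):
    """Return {directory: (is_on_path, exists_on_disk)} for each directory."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")

    def norm(p):
        # Ignore surrounding quotes, trailing separators and (on Windows) case.
        return os.path.normcase(os.path.normpath(p.strip().strip('"')))

    entries = {norm(p) for p in path_value.split(os.pathsep) if p.strip()}
    return {d: (norm(d) in entries, os.path.isdir(d)) for d in expected_dirs}


if __name__ == "__main__":
    # Assumption: default installer location; change v11.0 to your version.
    CUDA_ROOT = r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.0"
    dirs = [
        os.path.join(CUDA_ROOT, "bin"),
        os.path.join(CUDA_ROOT, "extras", "CUPTI", "lib64"),
        os.path.join(CUDA_ROOT, "include"),
        r"C:\tools\cuda\bin",
    ]
    for d, (on_path, exists) in check_cuda_paths(dirs).items():
        print(f"{d}: on PATH={on_path}, exists={exists}")
```

A directory that is reported as on PATH but not existing usually means the version folder in the SET PATH line does not match what is installed.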

Sometimes the GPU sync failed error occurs because other applications are using the GPU heavily or because a large amount of input data is being processed. In that case, stop those applications or notebooks and try running your code again.
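
To see whether the GPU is already heavily used before you start training, one option is to query nvidia-smi, which ships with the NVIDIA driver. The sketch below is only an illustration: the function names parse_gpu_memory and query_gpu_memory are made up for this example, and it assumes nvidia-smi is on your PATH.

```python
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'used, total' MiB pairs (one GPU per line) into a list of tuples."""
    usage = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        usage.append((used, total))
    return usage


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_memory(result.stdout)


if __name__ == "__main__":
    try:
        for index, (used, total) in enumerate(query_gpu_memory()):
            print(f"GPU {index}: {used} / {total} MiB in use")
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
```

If most of the memory is already in use before your script starts, close the other processes first; if it only fills up during training, a smaller batch_size is worth trying.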
