Tensorflow crashes when running with CUDA support

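I installed tensorflow 2.4.1 and everything worked, but I ran into a crash when trying to use Tensorboard. I looked at the GitHub issues page, where one of the maintainers said this was a known issue that would be fixed in 2.5.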

So I installed 2.5-rc, and then everything crashed. I tried downgrading back to 2.4.1, but the problem persisted. Nothing else I tried fixed the crash.

I went as far as removing my Anaconda installation, all Python source folders, and the CUDA and CuDNN installations, and then reinstalled everything.

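Following the TF help page, I installed TF-2.4.1 with CUDA 11.0 and CuDNN 8.0. It did work before I installed CUDA. Now it crashes every time, even when I manually hide the CUDA-enabled devices. This is the output I get: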

2021-05-06 21:52:41.777148: I tensorflow/compiler/jit/xla_cpu_device.cc:41] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-06 21:52:41.777782: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-05-06 21:52:41.808730: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.402GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2021-05-06 21:52:41.809006: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-05-06 21:52:41.812511: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-05-06 21:52:41.812659: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-05-06 21:52:41.814721: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-05-06 21:52:41.815428: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-05-06 21:52:41.819782: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-05-06 21:52:41.821370: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-05-06 21:52:41.822198: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-05-06 21:52:41.822393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-05-06 21:52:41.822962: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-05-06 21:52:41.882108: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1720] Found device 0 with properties: 
pciBusID: 0000:01:00.0 name: GeForce RTX 3060 Laptop GPU computeCapability: 8.6
coreClock: 1.402GHz coreCount: 30 deviceMemorySize: 6.00GiB deviceMemoryBandwidth: 268.26GiB/s
2021-05-06 21:52:41.882425: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-05-06 21:52:41.882571: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-05-06 21:52:41.882712: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-05-06 21:52:41.882855: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-05-06 21:52:41.883245: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-05-06 21:52:41.883438: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-05-06 21:52:41.883648: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-05-06 21:52:41.883842: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-05-06 21:52:41.884084: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1862] Adding visible gpu devices: 0
2021-05-06 21:52:42.347207: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1261] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-05-06 21:52:42.347408: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1267]      0 
2021-05-06 21:52:42.347499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1280] 0:   N 
2021-05-06 21:52:42.347777: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1406] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 4733 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3060 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6)
2021-05-06 21:52:42.359614: I tensorflow/compiler/jit/xla_gpu_device.cc:99] Not creating XLA devices, tf_xla_enable_xla_devices not set
2021-05-06 21:52:42.777774: I tensorflow/core/profiler/lib/profiler_session.cc:136] Profiler session initializing.
2021-05-06 21:52:42.777956: I tensorflow/core/profiler/lib/profiler_session.cc:155] Profiler session started.
2021-05-06 21:52:42.778158: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1365] Profiler found 1 GPUs
2021-05-06 21:52:42.780165: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cupti64_110.dll
2021-05-06 21:52:42.846459: I tensorflow/core/profiler/lib/profiler_session.cc:172] Profiler session tear down.
2021-05-06 21:52:42.846657: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1487] CUPTI activity buffer flushed
Epoch 1/20
2021-05-06 21:52:43.007415: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:116] None of the MLIR optimization passes are enabled (registered 2)
2021-05-06 21:52:43.367381: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-05-06 21:52:43.921448: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-05-06 21:52:43.927001: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll

Process finished with exit code -1073740791 (0xC0000409)

Does anyone know what the problem might be?
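For anyone trying to reproduce the "hide the CUDA-enabled devices" step from the question: one common way to do it is to set the `CUDA_VISIBLE_DEVICES` environment variable before TensorFlow is imported. The sketch below is only an illustration of that approach, not necessarily what the asker did; the helper names `hide_cuda_devices` and `visible_gpu_count` are made up for this example.

```python
import os


def hide_cuda_devices() -> None:
    """Hide all CUDA devices from libraries that initialise CUDA afterwards.

    Must run before TensorFlow touches the GPU, so call it before importing
    tensorflow (hypothetical helper, not part of any library).
    """
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"


def visible_gpu_count() -> int:
    """Return how many GPUs TensorFlow can see (requires tensorflow)."""
    # Imported here on purpose, so the environment variable is already set.
    import tensorflow as tf

    return len(tf.config.list_physical_devices("GPU"))
```

Calling `hide_cuda_devices()` first and `visible_gpu_count()` afterwards should report zero GPUs. If the process still crashes in that configuration, the problem is likely in loading the native libraries rather than in using the GPU itself.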

Almost every version of Tensorflow requires its own specific versions of CUDA and CUDNN. If you have Tensorflow 2.5, you need CUDA 11.2 and CUDNN 8.1: https://spltech.co.uk/how-to-install-tensorflow-2-5-with-cuda-11-2-and-cudnn-8-1-for-windows-10/
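As a rough illustration of that pairing, here is a small lookup sketch. It contains only the two combinations mentioned in this thread (2.4 with CUDA 11.0 / CuDNN 8.0, and 2.5 with CUDA 11.2 / CuDNN 8.1); the table name and the function `required_cuda_stack` are hypothetical, and other releases should be checked against the official tested-configurations table.

```python
import re
from typing import Optional, Tuple

# Pairings taken from this thread only; other releases are deliberately absent.
REQUIRED_STACK = {
    "2.4": ("11.0", "8.0"),  # (CUDA, CuDNN)
    "2.5": ("11.2", "8.1"),
}


def required_cuda_stack(tf_version: str) -> Optional[Tuple[str, str]]:
    """Return (cuda, cudnn) for a TensorFlow version string, or None if unknown."""
    match = re.match(r"(\d+)\.(\d+)", tf_version)
    if match is None:
        return None
    major_minor = f"{match.group(1)}.{match.group(2)}"
    return REQUIRED_STACK.get(major_minor)


print(required_cuda_stack("2.4.1"))
print(required_cuda_stack("2.5-rc"))
```

On an installed build, `tf.sysconfig.get_build_info()` should also report the CUDA and CuDNN versions the binary was compiled against, which can be compared with what is actually on the machine.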

I have had this problem before, and many different issues can lead to this error code.
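To make the error code a bit less opaque: the negative number in "Process finished with exit code -1073740791 (0xC0000409)" is the signed 32-bit form of a Windows NTSTATUS value. The sketch below shows the conversion. The helper name `to_ntstatus` and the small lookup table are made up for this example, and the table lists only two well-known codes.

```python
# A tiny, intentionally incomplete table of well-known NTSTATUS values.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC0000409: "STATUS_STACK_BUFFER_OVERRUN",
}


def to_ntstatus(exit_code: int) -> str:
    """Convert a signed 32-bit process exit code to its unsigned hex form."""
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


def describe(exit_code: int) -> str:
    """Return the hex code plus a symbolic name when it is in the table."""
    name = KNOWN_NTSTATUS.get(exit_code & 0xFFFFFFFF, "unknown")
    return f"{to_ntstatus(exit_code)} ({name})"


print(describe(-1073740791))
```

0xC0000409 is STATUS_STACK_BUFFER_OVERRUN, which Windows also reports for fail-fast aborts in native code, so the code alone does not say which library is at fault.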

I suggest installing TensorFlow with the versions listed in the GPU section [link].

You can also watch these videos [link].

Note: before reinstalling, try this:

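import tensorflow as tf

# Allocate GPU memory on demand instead of reserving it all at startup.
# Must run before any GPU has been initialized.
physical_devices = tf.config.list_physical_devices('GPU')
for device in physical_devices:  # does nothing when no GPU is visible
    tf.config.experimental.set_memory_growth(device, True)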
