Process finished with exit code -1073741571 (0xC00000FD) in TensorFlow

Question

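I know this question has been asked a lot, but my case is a bit odd. I just got an RTX 3080 and tried to install TensorFlow following a tutorial I found on reddit. I did everything as described there: Install Anaconda --> Python 3.8 --> TF-nightly v. 2.5.0 --> Visual Studio C++ --> CUDA 11.1.0 --> cuDNN 8.0.4 --> add path --> restart the PC. At first everything seemed to work. I tried the following commands: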

import tensorflow as tf
tf.config.list_physical_devices()

As you can see in the output, this works fine:

C:\Users\loose\.conda\envs\tf2\python.exe C:/Users/loose/PycharmProjects/GenerateAutomatedEMail/python/test.py
2021-01-16 00:40:45.043205: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-01-16 00:40:46.676446: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-01-16 00:40:46.699117: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties: 
pciBusID: 0000:2d:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.785GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2021-01-16 00:40:46.699285: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-01-16 00:40:46.713523: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-01-16 00:40:46.713626: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-01-16 00:40:46.717017: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-01-16 00:40:46.718013: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-01-16 00:40:46.725508: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-01-16 00:40:46.728010: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-01-16 00:40:46.728534: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-01-16 00:40:46.728660: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0

Process finished with exit code 0

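I am currently trying to train the Seq2Seq model from the TF tutorial. The code is almost identical, except that I use PyCharm instead of Jupyter and I put everything in a class. My full code is available on GitHub. When I try to train the model, I get "Process finished with exit code -1073741571 (0xC00000FD)". No actual error message is shown; the program simply ends with this exit code: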

C:\Users\loose\.conda\envs\tf2\python.exe C:/Users/loose/PycharmProjects/GenerateAutomatedEMail/python/train_model.py
2021-01-16 00:50:34.337791: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-01-16 00:50:36.873698: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library nvcuda.dll
2021-01-16 00:50:36.894834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties: 
pciBusID: 0000:2d:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.785GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2021-01-16 00:50:36.895004: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudart64_110.dll
2021-01-16 00:50:36.909453: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-01-16 00:50:36.909542: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-01-16 00:50:36.912954: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cufft64_10.dll
2021-01-16 00:50:36.914024: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library curand64_10.dll
2021-01-16 00:50:36.921476: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusolver64_10.dll
2021-01-16 00:50:36.924059: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cusparse64_11.dll
2021-01-16 00:50:36.924660: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-01-16 00:50:36.924807: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-01-16 00:50:36.925280: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-01-16 00:50:36.926213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1760] Found device 0 with properties: 
pciBusID: 0000:2d:00.0 name: GeForce RTX 3080 computeCapability: 8.6
coreClock: 1.785GHz coreCount: 68 deviceMemorySize: 10.00GiB deviceMemoryBandwidth: 707.88GiB/s
2021-01-16 00:50:36.926418: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1898] Adding visible gpu devices: 0
2021-01-16 00:50:37.388811: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1300] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-01-16 00:50:37.388901: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1306]      0 
2021-01-16 00:50:37.388947: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1319] 0:   N 
2021-01-16 00:50:37.389134: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1446] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7447 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3080, pci bus id: 0000:2d:00.0, compute capability: 8.6)
2021-01-16 00:50:38.006971: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cudnn64_8.dll
2021-01-16 00:50:38.586194: I tensorflow/stream_executor/cuda/cuda_dnn.cc:334] Loaded cuDNN version 8004
2021-01-16 00:50:38.709516: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-01-16 00:50:39.312210: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
2021-01-16 00:50:39.313013: I tensorflow/stream_executor/cuda/cuda_blas.cc:1838] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.

Process finished with exit code -1073741571 (0xC00000FD)
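
A side note on the exit code itself: PyCharm prints the process exit status as a signed 32-bit integer, while Windows lists its NTSTATUS codes in unsigned hexadecimal. The sketch below shows how the two forms in the message relate; the helper name `to_ntstatus_hex` is made up for illustration. In the Windows NTSTATUS table, 0xC00000FD is STATUS_STACK_OVERFLOW, which may point to a stack overflow in native code rather than a shortage of RAM or GPU memory, although the log alone does not prove that.

```python
def to_ntstatus_hex(exit_code: int) -> str:
    """Convert a signed 32-bit exit code to its unsigned hexadecimal form."""
    # Masking with 0xFFFFFFFF reinterprets the two's-complement value as unsigned.
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


print(to_ntstatus_hex(-1073741571))  # 0xC00000FD
```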

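So I tried to find the line at which the program crashes. I found that it crashes right after the "BahdanauAttention" class is initialized, as shown in the screenshot.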
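
When the interpreter dies without a Python traceback, one possible way to narrow down the crashing line is the standard-library `faulthandler` module. This is a minimal sketch, assuming it sits at the very top of the training script, before TensorFlow is imported. `run_training` is a hypothetical placeholder, not a function from the original code. Whether a traceback actually gets printed depends on how badly the process state is damaged at crash time.

```python
import faulthandler

# Ask Python to dump the traceback of every thread to stderr when the
# process hits a fatal error. On Windows this also installs a handler
# for fatal Windows exceptions.
faulthandler.enable(all_threads=True)


def run_training():
    """Hypothetical placeholder for the real training entry point."""
    pass


run_training()
```

If a dump appears, the last Python frame listed is a reasonable starting point for the search.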

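After several hours of testing, I can assume/confirm a few things:

I can run normal (non-TensorFlow) code in this venv without this error
I am not running out of memory (at most 17 GB of my 32 GB RAM is in use)
I do not have any programs open that could cause a conflict (e.g. NVIDIA Broadcast or Jupyter Lab)

Things I tried in order to fix this:

Reinstalled conda
Created a new venv
Reinstalled TF as well as all NVIDIA drivers
Tried a different Python version (3.7 instead of 3.8)
Restarted my PC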

At this point I am pretty much out of options. Does anyone know how to fix this?

Answer 1

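You can upgrade TensorFlow to the latest stable version, since TensorFlow 2.4 supports NVIDIA's new Ampere architecture (to which the RTX 30 series belongs) as well as CUDA 11.
You can check the details in this table and follow the guide to install it.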
https://www.tensorflow.org/install/source_windows#tested_build_configurations
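
To see which build is actually installed in the environment, a small check along these lines might help. It is only a sketch: the helper names are made up for illustration, and the 2.4 threshold is taken from this answer rather than verified independently.

```python
def parse_major_minor(version: str) -> tuple:
    """Extract (major, minor) from a string such as '2.5.0-dev20210115'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def meets_minimum(version: str, minimum=(2, 4)) -> bool:
    """True if the version is at least `minimum` (2.4, per this answer)."""
    return parse_major_minor(version) >= minimum


def report_installed_version():
    """Print the installed TensorFlow version and whether it is a CUDA build."""
    import tensorflow as tf  # imported here so the helpers above work without TF

    print("TensorFlow:", tf.__version__)
    print("At least 2.4:", meets_minimum(tf.__version__))
    print("Built with CUDA:", tf.test.is_built_with_cuda())
```

Calling `report_installed_version()` inside the conda environment prints the three values. A nightly build such as 2.5.0 passes the version check, so this only rules out an outdated install.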

Regarding memory usage on the GPU, you can always enable memory growth at the start of your code, as described here.
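
A minimal sketch of the memory-growth setting, assuming it runs before any tensors or models are created (TensorFlow raises a RuntimeError if the GPU has already been initialized). The function name `enable_memory_growth` is made up for illustration. The ImportError guard is only there so the snippet can be loaded on a machine without TensorFlow; in a real training script you would let the import fail.

```python
def enable_memory_growth() -> int:
    """Enable memory growth on every visible GPU and return how many were set."""
    try:
        import tensorflow as tf
    except ImportError:
        # Assumption for this sketch only: treat a missing TensorFlow as "no GPUs".
        return 0

    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Allocate GPU memory on demand instead of reserving most of it up front.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)
```

Calling `enable_memory_growth()` as the first statement of the script is the intended use. On the single RTX 3080 described in the question it would be expected to return 1.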

Answer 1: accepted 2021-02-03 08:21:55
