Is automatic mixed precision supported by tf.keras in TensorFlow 2.0 Beta?
I am trying to get TensorFlow's automatic mixed precision (AMP) working with the tf.keras API, in order to use the tensor cores on an RTX 2080 Ti, but I can't see any speed-up in training.
I have just added
os.environ['TF_ENABLE_AUTO_MIXED_PRECISION'] = '1'
to the top of the Python script. I also tried setting the environment variable to 1 from the command line, i.e.
export TF_ENABLE_AUTO_MIXED_PRECISION=1
Is AMP supported in this case, or does the model need to be implemented in 'raw' TensorFlow?
At the moment, automatic mixed precision is only supported when using the TensorFlow Docker container from NVIDIA:
https://ngc.nvidia.com/catalog/containers/nvidia:tensorflow
https://www.tensorflow.org/install/docker
You need to use Ubuntu 18.04; the current Ubuntu version and Windows are not supported. The latest Docker container ships TF 1.13, if I'm not mistaken. Once it is installed, tf.keras should support automatic mixed precision.
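As a minimal sketch of the setup described here: the flag is set in the script before TensorFlow is imported (a conservative ordering, not something the source states is required), and an ordinary tf.keras model is then built as usual. The helper name `build_model` and the layer sizes are hypothetical and only serve as illustration; whether the flag has any effect depends on the TensorFlow build, as discussed above.

```python
import os

# Set the flag before TensorFlow is imported, so it is already present
# in the environment by the time any graph is built or optimized.
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"


def build_model():
    """Build a small tf.keras model (hypothetical example architecture)."""
    # Imported here so that the environment variable above is set first.
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(512, activation="relu", input_shape=(512,)),
        tf.keras.layers.Dense(512, activation="relu"),
        tf.keras.layers.Dense(8),
    ])
    model.compile(optimizer="sgd", loss="mse")
    return model
```

No changes to the model code itself are needed with this approach; the only difference from a normal tf.keras script is the environment variable.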
Edit:
I tried 2.0.0-beta1 on Windows and also did not notice any speed improvement when using automatic mixed precision. With the NVIDIA Docker container on Linux, I got at least a 2x speed-up when setting TF_ENABLE_AUTO_MIXED_PRECISION to 1. Hopefully, this will work in the 2.0 release.
Edit 2: With TF 2.0.0-rc0, AMP improves the performance as expected for a simple model. With a more complex model (a U-Net variant), no whitelist ops are found and I see no performance difference.
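To check whether AMP makes a difference for a given model, as in the comparisons above, it helps to time the same training call with and without the flag. The following is only a sketch of one way to do that; the helper names `time_steps` and `speedup` are hypothetical, and the warm-up and repeat counts are arbitrary choices.

```python
import time


def time_steps(step_fn, warmup=2, repeats=5):
    """Return the mean wall-clock seconds per call of step_fn."""
    # Warm-up calls are not timed: the first calls usually include
    # one-off costs such as graph construction.
    for _ in range(warmup):
        step_fn()
    start = time.perf_counter()
    for _ in range(repeats):
        step_fn()
    return (time.perf_counter() - start) / repeats


def speedup(baseline_seconds, amp_seconds):
    """Ratio of baseline time to AMP time; 2.0 means twice as fast."""
    return baseline_seconds / amp_seconds
```

For example, `step_fn` could be something like `lambda: model.fit(x, y, epochs=1, verbose=0)`, run once in a process with TF_ENABLE_AUTO_MIXED_PRECISION unset and once with it set to 1, and the two results passed to `speedup`.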