RTX 2080 not fully used by TensorFlow 2.0
I'm currently benchmarking my RTX 2080 with TensorFlow 2.0, but it is no faster than the GTX 1050 Ti in my laptop. Here is my current code: https://github.com/clementpoiret/IBM-Capstone-CNN/blob/master/capstone.py
I heard about setting allow_growth to true, which I did, and it solved a cuDNN issue. The problem is that the GPU seems locked:
nvidia-smi always returns similar values during training, and python seems stuck at 2655 MiB:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.26 Driver Version: 440.26 CUDA Version: 10.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GeForce RTX 2080 Off | 00000000:2D:00.0 Off | N/A |
| 0% 55C P2 60W / 265W | 2912MiB / 7979MiB | 24% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 1026 G /usr/lib/Xorg 200MiB |
| 0 6420 G cinnamon 43MiB |
| 0 17187 C /home/clementpoiret/anaconda3/bin/python 2655MiB |
+-----------------------------------------------------------------------------+
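To track whether usage actually changes during training, rather than eyeballing nvidia-smi, one can poll its CSV query mode from Python. A minimal sketch, where the `gpu_memory_used` helper and the sample string are my own illustration, not from the original post:

```python
import subprocess

def gpu_memory_used(sample=None):
    """Return MiB of GPU memory in use, one entry per GPU.

    If `sample` is given, parse it instead of invoking nvidia-smi
    (handy on machines without an NVIDIA GPU).
    """
    out = sample if sample is not None else subprocess.check_output(
        ["nvidia-smi", "--query-gpu=memory.used",
         "--format=csv,noheader,nounits"], text=True)
    return [int(line.strip()) for line in out.strip().splitlines()]

# Sample mirroring the 2912 MiB reading shown above:
print(gpu_memory_used("2912"))  # → [2912]
```

Calling this in a loop during training would show whether the process really plateaus at the same value from step to step.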
Here is the code I use to set up the GPUs:
import tensorflow as tf

def setup_gpus():
    gpus = tf.config.experimental.list_physical_devices('GPU')
    if gpus:
        try:
            # Currently, memory growth needs to be the same across GPUs
            for gpu in gpus:
                tf.config.experimental.set_memory_growth(gpu, True)
            logical_gpus = tf.config.experimental.list_logical_devices('GPU')
            print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
        except RuntimeError as e:
            # Memory growth must be set before GPUs have been initialized
            print(e)
My current configuration is:
System: Host: Workstation Kernel: 5.3.8-3-MANJARO x86_64 bits: 64 compiler: gcc v: 9.2.0 Console: tty 0 dm: LightDM 1.30.0
Distro: Manjaro Linux
Machine: Type: Desktop Mobo: Micro-Star model: MPG X570 GAMING EDGE WIFI (MS-7C37) v: 1.0 serial: <filter>
UEFI: American Megatrends v: 1.50 date: 10/29/2019
CPU: Topology: 12-Core model: AMD Ryzen 9 3900X bits: 64 type: MT MCP arch: Zen L2 cache: 6144 KiB
flags: avx avx2 lm nx pae sse sse2 sse3 sse4_1 sse4_2 sse4a ssse3 svm bogomips: 182474
Speed: 4155 MHz min/max: 2200/3800 MHz boost: enabled Core speeds (MHz): 1: 3839 2: 2170 3: 3692 4: 2111 5: 3901
6: 2094 7: 2069 8: 2176 9: 2177 10: 4361 11: 2191 12: 2200 13: 2169 14: 2200 15: 2179 16: 3821 17: 2183 18: 2190
19: 2179 20: 4356 21: 2198 22: 2201 23: 2197 24: 2109
Graphics: Device-1: NVIDIA TU104 [GeForce RTX 2080 Rev. A] vendor: Gigabyte driver: nvidia v: 440.26 bus ID: 2d:00.0
chip ID: 10de:1e87
Display: server: X.org 1.20.5 driver: nvidia tty: 191x17
Message: Unable to show advanced data. Required tool glxinfo missing.
Audio: Device-1: NVIDIA vendor: Gigabyte driver: snd_hda_intel v: kernel bus ID: 2d:00.1 chip ID: 10de:10f8
Device-2: Advanced Micro Devices [AMD] Starship/Matisse HD Audio vendor: Micro-Star MSI driver: snd_hda_intel
v: kernel bus ID: 2f:00.4 chip ID: 1022:1487
Sound Server: ALSA v: k5.3.8-3-MANJARO
Network: Device-1: Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet vendor: Micro-Star MSI driver: r8169 v: kernel
port: d000 bus ID: 27:00.0 chip ID: 10ec:8168
IF: enp39s0 state: down mac: <filter>
Device-2: Intel Dual Band Wireless-AC 3168NGW [Stone Peak] driver: iwlwifi v: kernel port: d000 bus ID: 29:00.0
chip ID: 8086:24fb
IF: wlp41s0 state: up mac: <filter>
Drives: Local Storage: total: 2.84 TiB used: 90.16 GiB (3.1%)
ID-1: /dev/nvme0n1 vendor: Corsair model: Force MP300 size: 111.79 GiB speed: 15.8 Gb/s lanes: 2 serial: <filter>
rev: E8FM12.0 scheme: GPT
ID-2: /dev/nvme1n1 vendor: Samsung model: SSD 970 EVO 250GB size: 232.89 GiB speed: 31.6 Gb/s lanes: 4
serial: <filter> rev: 2B2QEXE7 scheme: GPT
ID-3: /dev/sda vendor: Western Digital model: WD30EZRZ-00GXCB0 size: 2.73 TiB speed: 6.0 Gb/s rotation: 5400 rpm
serial: <filter> rev: 0A80 scheme: GPT
Partition: ID-1: / size: 227.74 GiB used: 64.62 GiB (28.4%) fs: ext4 dev: /dev/nvme1n1p2
Sensors: System Temperatures: cpu: 67.2 C mobo: 32.0 C gpu: nvidia temp: 53 C
Fan Speeds (RPM): fan-1: 0 fan-2: 2017 fan-3: 0 fan-4: 0 fan-5: 0 fan-6: 0 fan-7: 0 gpu: nvidia fan: 30%
Info: Processes: 406 Uptime: 4d 14h 26m Memory: 31.37 GiB used: 6.09 GiB (19.4%) Init: systemd v: 242 Compilers:
gcc: 9.2.0 Shell: zsh v: 5.7.1 running in: tty 0 inxi: 3.0.36
I heard that poor CNN performance can be caused by data stored on an HDD rather than an SSD, but I get the same performance from both drives.
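To rule the disks in or out, one can measure raw sequential read throughput directly; if both drives feed data far faster than the GPU consumes it, storage is unlikely to be the bottleneck. A minimal sketch, where the helper name and the 8 MiB test file are my own choices:

```python
import os
import tempfile
import time

def read_throughput_mb_s(path, block=1 << 20):
    """Sequentially read `path` in 1 MiB chunks, return throughput in MB/s."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as f:
        while chunk := f.read(block):
            total += len(chunk)
    elapsed = time.perf_counter() - start
    return total / 1e6 / elapsed

# Create a small throwaway file to demonstrate the measurement;
# in practice one would point this at a file on each drive.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(8 * 1 << 20))  # 8 MiB of random data
rate = read_throughput_mb_s(tmp.name)
print(f"{rate:.0f} MB/s")
os.unlink(tmp.name)
```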
Do you have any ideas?
Thanks
I'm getting this too, on both my RTX 2080 and my GTX 960. If I use my Quadro M2000 or P4000 cards, the GPU goes into P0 mode and runs at full speed. Is NVIDIA deliberately limiting TF performance on its non-"professional" cards to push people toward paying up?
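On the P-state question, it is easy to confirm which mode the card sits in during training, since `nvidia-smi` can report it directly. A minimal sketch, where the `query_pstate` helper and the sample line are illustrative, not from the original post:

```python
import subprocess

def query_pstate(sample=None):
    """Return a list of (pstate, sm_clock_mhz) tuples, one per GPU.

    If `sample` is given, parse it instead of calling nvidia-smi
    (useful when no GPU is present).
    """
    out = sample if sample is not None else subprocess.check_output(
        ["nvidia-smi", "--query-gpu=pstate,clocks.sm",
         "--format=csv,noheader,nounits"], text=True)
    rows = []
    for line in out.strip().splitlines():
        pstate, clock = (field.strip() for field in line.split(","))
        rows.append((pstate, int(clock)))
    return rows

# Sample resembling the stuck-in-P2 situation described above:
print(query_pstate("P2, 1515"))  # → [('P2', 1515)]
```

Logging this alongside step times would show whether the consumer cards ever leave P2 under TensorFlow load.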