
How to run TensorFlow on an NVIDIA GPU?

I'm trying to run a TensorFlow model on an NVIDIA Titan RTX for the first time, but I'm getting some errors.

CUDA version:

$ cat /usr/local/cuda/version.json
{
   "cuda" : {
      "name" : "CUDA SDK",
      "version" : "11.3.20210326"
   },
...
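For comparing versions programmatically, the same file can be read from Python. This is a minimal sketch that assumes the layout shown above (a top-level "cuda" object with a "version" field) and the default path /usr/local/cuda/version.json; the function name read_cuda_version is only illustrative.

```python
import json
from pathlib import Path


def read_cuda_version(path="/usr/local/cuda/version.json"):
    """Return the CUDA toolkit version string from version.json, or None.

    Assumes the layout shown above: {"cuda": {"version": "..."}}.
    """
    version_file = Path(path)
    if not version_file.is_file():
        # No toolkit at this location (or an older layout without version.json).
        return None
    data = json.loads(version_file.read_text())
    return data.get("cuda", {}).get("version")


if __name__ == "__main__":
    print(read_cuda_version())
```

The function returns None rather than raising when the file is missing, so it can be called on machines where the toolkit lives elsewhere.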

Python 3.9.1 and TensorFlow 2.5.0-rc1:

Traceback (most recent call last):
  File "/home/marcus/COVID-19-forecasting/COVID-19/run_experiments.py", line 23, in <module>
    exp.run_experiments(dat.horizon, dat.pad_val, dat.padded_scaled_train, dat.multi_out_scaled_val, dat.padded_scaled_test_x,
  File "/home/marcus/COVID-19-forecasting/COVID-19/experiment.py", line 110, in run_experiments
    lstm_hist = lstm.fit([tr, enc_names], [v[0], v[1], v[2]], self.epochs, verbose=0)
  File "/home/marcus/COVID-19-forecasting/COVID-19/model.py", line 55, in fit
    return self.model.fit(x=x, y=y, epochs=epochs, callbacks=callbacks, verbose=verbose)
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/keras/engine/training.py", line 1183, in fit
    tmp_logs = self.train_function(iterator)
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 889, in __call__
    result = self._call(*args, **kwds)
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/def_function.py", line 950, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 3023, in __call__
    return graph_function._call_flat(
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 1960, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/function.py", line 591, in call
    outputs = execute.execute(
  File "/home/marcus/COVID-19-forecasting/covid-venv/lib/python3.9/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.UnknownError:    Fail to find the dnn implementation.
         [[{{node cond_40/then/_0/cond/CudnnRNNV3}}]]
         [[multi_output_rnn/encoder_block/rnn_encoder/PartitionedCall]] [Op:__inference_train_function_6309]

Function call stack:
train_function -> train_function -> train_function
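The failing node is CudnnRNNV3, and "Fail to find the dnn implementation" is commonly reported when cuDNN cannot be loaded or initialized, so it may help to compare what the TensorFlow wheel was built against with what is installed on the machine. The sketch below is one way to collect that information; the helper name gpu_diagnostics is hypothetical, and it only uses tf.sysconfig.get_build_info(), tf.test.is_built_with_cuda() and tf.config.list_physical_devices().

```python
import importlib.util


def gpu_diagnostics():
    """Collect TensorFlow build and device info as a plain dict.

    Returns {"tensorflow": None} when TensorFlow is not installed, so the
    helper is safe to call in any environment.
    """
    if importlib.util.find_spec("tensorflow") is None:
        return {"tensorflow": None}

    import tensorflow as tf

    # Versions the wheel was built against; these keys can be absent on
    # CPU-only builds, hence .get() instead of indexing.
    build = tf.sysconfig.get_build_info()
    return {
        "tensorflow": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "cuda_build_version": build.get("cuda_version"),
        "cudnn_build_version": build.get("cudnn_version"),
        "visible_gpus": [d.name for d in tf.config.list_physical_devices("GPU")],
    }


if __name__ == "__main__":
    for key, value in gpu_diagnostics().items():
        print(f"{key}: {value}")
```

An empty visible_gpus list, or build versions that differ from the installed CUDA toolkit, would point to an environment mismatch rather than a problem in the model code.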

I tried adding these lines to my code, but nothing changed.

import tensorflow as tf
# Allocate GPU memory on demand instead of reserving it all up front.
physical_devices = tf.config.experimental.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], enable=True)

I'm not sure whether this is a bug or a problem with the machine I'm using, but with Python 3.9 I get TensorFlow 2.5, and this combination does not seem to work on the GPU.

My solution was to install Python 3.8 and then, inside a venv, install TensorFlow 2.4. With that setup my script worked.
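The downgrade works because each TensorFlow release supports only a range of Python versions and is tested against specific CUDA and cuDNN versions. The sketch below encodes the two releases mentioned here as a small lookup. The values are an assumption taken from TensorFlow's published tested build configurations for Linux GPU builds and should be checked against that table; TESTED_CONFIGS and python_supported are illustrative names.

```python
import sys

# Assumed values from TensorFlow's "tested build configurations" table
# (Linux, GPU). Verify against the official table before relying on them.
TESTED_CONFIGS = {
    "2.4": {"python": ((3, 6), (3, 8)), "cuda": "11.0", "cudnn": "8.0"},
    "2.5": {"python": ((3, 6), (3, 9)), "cuda": "11.2", "cudnn": "8.1"},
}


def python_supported(tf_release, python_version):
    """Return True if (major, minor) of python_version is in the tested range."""
    low, high = TESTED_CONFIGS[tf_release]["python"]
    return low <= tuple(python_version[:2]) <= high


if __name__ == "__main__":
    current = sys.version_info[:2]
    for release, config in TESTED_CONFIGS.items():
        print(
            f"TensorFlow {release}: Python {current[0]}.{current[1]} "
            f"supported={python_supported(release, current)}, "
            f"tested with CUDA {config['cuda']} / cuDNN {config['cudnn']}"
        )
```

Under these assumptions, TensorFlow 2.4 has no Python 3.9 support, which is why the working setup needed Python 3.8. Neither release lists CUDA 11.3, so the installed toolkit and cuDNN versions are still worth checking.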


 