Why is TensorFlow not running on the GPU while GPU devices are identified in Python?

I installed TensorFlow 2.2.0 and TensorFlow-gpu 2.2.0 on Windows 10. I also installed CUDA Toolkit v10.1 and copied the cuDNN 7.6.5 files into the CUDA directories. My GPU is an NVIDIA GeForce 940MX. In addition, I set the CUDA path on Windows. When I list the devices with the code below, both CPU and GPU are recognized:

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

The output is:

[name: "/device:CPU:0"
 device_type: "CPU"
 memory_limit: 268435456
 locality {
 }
 incarnation: 13265748925766868529,
 name: "/device:XLA_CPU:0"
 device_type: "XLA_CPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 14569071601529958377
 physical_device_desc: "device: XLA_CPU device",
 name: "/device:XLA_GPU:0"
 device_type: "XLA_GPU"
 memory_limit: 17179869184
 locality {
 }
 incarnation: 15045400394346252324
 physical_device_desc: "device: XLA_GPU device"]

However, when I run my code, it seems to run only on the CPU. In addition, when I test GPU availability with tf.test.is_gpu_available(), the GPU is not recognized and False is returned.
Likewise, tf.config.list_physical_devices('GPU') prints an empty list, []. When I run tf.config.experimental.list_physical_devices(), these three physical devices are listed:

[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'),
 PhysicalDevice(name='/physical_device:XLA_CPU:0', device_type='XLA_CPU'),
 PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU')]

Notably, when I run tf.config.list_physical_devices('XLA_GPU'), this is printed: [PhysicalDevice(name='/physical_device:XLA_GPU:0', device_type='XLA_GPU')]
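The difference between a plain `GPU` entry and an `XLA_GPU` entry in the listings above can be checked programmatically. This is a minimal sketch and not part of the original post: `has_cuda_gpu` is a hypothetical helper that takes device-type strings like those printed above and reports whether a plain `GPU` device is present. The TensorFlow part is wrapped in a try/except so the snippet also runs where TensorFlow is not installed.

```python
def has_cuda_gpu(device_types):
    """Return True only if a plain 'GPU' device type is present.

    In the question, an 'XLA_GPU' entry alone went together with
    tf.test.is_gpu_available() returning False, so it is not counted here.
    """
    return "GPU" in set(device_types)


# Device types copied from the listing in the question
reported = ["CPU", "XLA_CPU", "XLA_GPU"]
print(has_cuda_gpu(reported))

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow not installed; skip the live check

if tf is not None:
    # Same call as in the question, without a device-type filter
    live = [d.device_type for d in tf.config.list_physical_devices()]
    print(live, has_cuda_gpu(live))
```

With the device types from the question the helper reports no usable GPU, which matches the empty list returned by tf.config.list_physical_devices('GPU').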

Also, when we run the code, Task Manager shows the CPU at 96% utilization and the GPU at only 1%.
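One thing that could be ruled out in a situation like this is a CUDA or cuDNN library that TensorFlow cannot find on PATH. The sketch below is an assumption-laden illustration, not something the original post confirms as the cause: the DLL names in `ASSUMED_DLLS` are the ones I would expect for TensorFlow 2.2 with CUDA 10.1 and cuDNN 7.x on Windows, and `find_missing` is a hypothetical helper.

```python
import os

# Assumed DLL names for TensorFlow 2.2 with CUDA 10.1 and cuDNN 7.x on Windows
ASSUMED_DLLS = ["cudart64_101.dll", "cublas64_10.dll", "cudnn64_7.dll"]


def find_missing(file_names, search_dirs):
    """Return the names from file_names that exist in none of search_dirs."""
    missing = []
    for name in file_names:
        if not any(os.path.isfile(os.path.join(d, name)) for d in search_dirs):
            missing.append(name)
    return missing


# Directories listed in the PATH environment variable, skipping empty entries
path_dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
print("Not found on PATH:", find_missing(ASSUMED_DLLS, path_dirs))
```

An empty result would suggest the libraries are at least discoverable; it does not prove that TensorFlow can load them.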

The code we run is as follows:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout, Bidirectional
from tensorflow.keras.callbacks import ModelCheckpoint, TensorBoard
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from yahoo_fin import stock_info as si
from collections import deque

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import time
import os
import random


# set seed, so we can get the same results after rerunning several times
np.random.seed(314)
tf.random.set_seed(314)
random.seed(314)


def load_data(ticker, n_steps=50, scale=True, shuffle=True, lookup_step=1, 
                test_size=0.2, feature_columns=['adjclose', 'volume', 'open', 'high', 'low']):
    # see if ticker is already a loaded stock from yahoo finance
    if isinstance(ticker, str):
        # load it from yahoo_fin library
        df = si.get_data(ticker)
    elif isinstance(ticker, pd.DataFrame):
        # already loaded, use it directly
        df = ticker
    else:
        raise TypeError("ticker must be a str or a pandas DataFrame")
    # this will contain all the elements we want to return from this function
    result = {}
    # we will also return the original dataframe itself
    result['df'] = df.copy()
    # make sure that the passed feature_columns exist in the dataframe
    for col in feature_columns:
        assert col in df.columns, f"'{col}' does not exist in the dataframe."
    if scale:
        column_scaler = {}
        # scale the data (prices) from 0 to 1
        for column in feature_columns:
            scaler = preprocessing.MinMaxScaler()
            df[column] = scaler.fit_transform(np.expand_dims(df[column].values, axis=1))
            column_scaler[column] = scaler

        # add the MinMaxScaler instances to the result returned
        result["column_scaler"] = column_scaler
    # add the target column (label) by shifting by `lookup_step`
    df['future'] = df['adjclose'].shift(-lookup_step)
    # the last `lookup_step` rows contain NaN in the future column
    # get them before dropping NaNs
    last_sequence = np.array(df[feature_columns].tail(lookup_step))
    # drop NaNs
    df.dropna(inplace=True)
    sequence_data = []
    sequences = deque(maxlen=n_steps)
    for entry, target in zip(df[feature_columns].values, df['future'].values):
        sequences.append(entry)
        if len(sequences) == n_steps:
            sequence_data.append([np.array(sequences), target])
    # get the last sequence by appending the last `n_step` sequence with `lookup_step` sequence
    # for instance, if n_steps=50 and lookup_step=10, last_sequence should be of 59 (that is 50+10-1) length
    # this last_sequence will be used to predict in future dates that are not available in the dataset
    last_sequence = list(sequences) + list(last_sequence)
    # shift the last sequence by -1
    last_sequence = np.array(pd.DataFrame(last_sequence).shift(-1).dropna())
    # add to result
    result['last_sequence'] = last_sequence
    # construct the X's and y's
    X, y = [], []
    for seq, target in sequence_data:
        X.append(seq)
        y.append(target)
    # convert to numpy arrays
    X = np.array(X)
    y = np.array(y)
    # reshape X to fit the neural network
    X = X.reshape((X.shape[0], X.shape[2], X.shape[1]))
    # split the dataset
    result["X_train"], result["X_test"], result["y_train"], result["y_test"] = train_test_split(X, y, test_size=test_size, shuffle=shuffle)
    # return the result
    return result


def create_model(sequence_length, units=256, cell=LSTM, n_layers=2, dropout=0.3,
                loss="mean_absolute_error", optimizer="rmsprop", bidirectional=False):
    model = Sequential()
    for i in range(n_layers):
        if i == 0:
            # first layer
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=True), input_shape=(None, sequence_length)))
            else:
                model.add(cell(units, return_sequences=True, input_shape=(None, sequence_length)))
        elif i == n_layers - 1:
            # last layer
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=False)))
            else:
                model.add(cell(units, return_sequences=False))
        else:
            # hidden layers
            if bidirectional:
                model.add(Bidirectional(cell(units, return_sequences=True)))
            else:
                model.add(cell(units, return_sequences=True))
        # add dropout after each layer
        model.add(Dropout(dropout))
    model.add(Dense(1, activation="linear"))
    model.compile(loss=loss, metrics=["mean_absolute_error"], optimizer=optimizer)
    return model

# Window size or the sequence length
N_STEPS = 100
# Lookup step, 1 is the next day
LOOKUP_STEP = 1
# test ratio size, 0.2 is 20%
TEST_SIZE = 0.2
# features to use
FEATURE_COLUMNS = ["adjclose", "volume", "open", "high", "low"]
# date now
date_now = time.strftime("%Y-%m-%d")
### model parameters
N_LAYERS = 3
# LSTM cell
CELL = LSTM
# 256 LSTM neurons
UNITS = 256
# 40% dropout
DROPOUT = 0.4
# whether to use bidirectional RNNs
BIDIRECTIONAL = False
### training parameters
# mean absolute error loss
# LOSS = "mae"
# huber loss
LOSS = "huber_loss"
OPTIMIZER = "adam"
BATCH_SIZE = 64
EPOCHS = 400
# Apple stock market
ticker = "AAPL"
ticker_data_filename = os.path.join("data", f"{ticker}_{date_now}.csv")
# model name to save, making it as unique as possible based on parameters
model_name = f"{date_now}_{ticker}-{LOSS}-{OPTIMIZER}-{CELL.__name__}-seq-{N_STEPS}-step-{LOOKUP_STEP}-layers-{N_LAYERS}-units-{UNITS}"
if BIDIRECTIONAL:
    model_name += "-b"
    
# create these folders if they do not exist
for folder in ("results", "logs", "data"):
    os.makedirs(folder, exist_ok=True)


# load the data
data = load_data(ticker, N_STEPS, lookup_step=LOOKUP_STEP, test_size=TEST_SIZE, feature_columns=FEATURE_COLUMNS)

# save the dataframe
data["df"].to_csv(ticker_data_filename)

# construct the model
model = create_model(N_STEPS, loss=LOSS, units=UNITS, cell=CELL, n_layers=N_LAYERS,
                    dropout=DROPOUT, optimizer=OPTIMIZER, bidirectional=BIDIRECTIONAL)

# some tensorflow callbacks
checkpointer = ModelCheckpoint(os.path.join("results", model_name + ".h5"), save_weights_only=True, save_best_only=True, verbose=1)
tensorboard = TensorBoard(log_dir=os.path.join("logs", model_name))

history = model.fit(data["X_train"], data["y_train"],
                    batch_size=BATCH_SIZE,
                    epochs=EPOCHS,
                    validation_data=(data["X_test"], data["y_test"]),
                    callbacks=[checkpointer, tensorboard],
                    verbose=1)

model.save(os.path.join("results", model_name) + ".h5")
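To see where operations in a script like the one above actually run, one option is TensorFlow's device placement logging. The following is a minimal sketch under my own assumptions: `device_kind` is a hypothetical helper that pulls the device type out of a TensorFlow device string, and the TensorFlow part only runs if the import succeeds.

```python
def device_kind(device_string):
    """Extract the device type (e.g. 'CPU' or 'GPU') from a TensorFlow device string."""
    # Device strings look like '/job:localhost/replica:0/task:0/device:GPU:0'
    parts = device_string.split(":")
    return parts[-2] if len(parts) >= 2 else ""


print(device_kind("/job:localhost/replica:0/task:0/device:CPU:0"))

try:
    import tensorflow as tf
except ImportError:
    tf = None  # TensorFlow not installed; skip the live check

if tf is not None:
    # Log the device chosen for each op; enable this before creating tensors or models
    tf.debugging.set_log_device_placement(True)
    x = tf.random.uniform((256, 256))
    y = tf.matmul(x, x)
    print("matmul ran on:", device_kind(y.device))
```

If the matmul is reported on the CPU even though a GPU is installed, that is consistent with the Task Manager readings described in the question.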

**GPU and CPU performance** are as follows:

[Image: GPU and CPU performance]

Could you help me, please?

I solved this problem through the NVIDIA Control Panel. I right-clicked on the desktop and chose NVIDIA Control Panel.

Then, under Set PhysX Configuration, I went to Select a PhysX Processor and selected Auto-select (recommended).

Also, under Manage 3D settings, I restored the settings by clicking the Restore button. You can also assign Python to the GPU from the Program Settings of this section; I did that. Please apply the changes at every stage. Finally, running each of the snippets above gave the expected results:


Code1:
from tensorflow.python.client import device_lib
device_lib.list_local_devices()
Output1:
[name: "/device:CPU:0"
device_type: "CPU"
memory_limit: 268435456
locality {
}
incarnation: 12330560057435677891
, name: "/device:XLA_CPU:0"
device_type: "XLA_CPU"
memory_limit: 17179869184
locality {
}
incarnation: 14076398930644318194
physical_device_desc: "device: XLA_CPU device"
, name: "/device:GPU:0"
device_type: "GPU"
memory_limit: 3186897715
locality {
  bus_id: 1
  links {
  }
}
incarnation: 5889399188264267952
physical_device_desc: "device: 0, name: GeForce 940MX, pci bus id: 0000:01:00.0, compute capability: 5.0"
, name: "/device:XLA_GPU:0"
device_type: "XLA_GPU"
memory_limit: 17179869184
locality {
}
incarnation: 8080361800351872259
physical_device_desc: "device: XLA_GPU device"
]

Code2:
import tensorflow as tf
tf.config.list_physical_devices('GPU')

Output2:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Code3:
tf.test.is_gpu_available()
Output3:
True

The function definition should look like this:

def load_data(ticker, n_steps=N_STEPS):

so that the variable you defined as 100 elsewhere in the code (N_STEPS) is used as the default.
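One detail worth keeping in mind with this suggestion: Python evaluates default argument values once, when the function is defined, so `N_STEPS` has to exist before the `def` line. A minimal sketch, with a stub standing in for the real `load_data` (the stub body is purely illustrative):

```python
N_STEPS = 100  # must be defined before the function that uses it as a default


def load_data(ticker, n_steps=N_STEPS):
    """Stub standing in for the real load_data; returns the window size it would use."""
    return n_steps


print(load_data("AAPL"))                   # uses the default captured at definition time

N_STEPS = 50                               # rebinding later does not change the stored default
print(load_data("AAPL"))                   # still the value captured when the function was defined
print(load_data("AAPL", n_steps=N_STEPS))  # passing it explicitly picks up the new value
```

Passing the value explicitly at the call site, as the question's script already does with load_data(ticker, N_STEPS, ...), avoids this subtlety.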
