
PyTorch not using CUDA device

I have the following code:

from __future__ import print_function
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
import scipy.io

folder = 'small/'
mat = scipy.io.loadmat(folder + 'INISTATE.mat')
ini_state = np.float32(mat['ini_state'])
ini_state = torch.from_numpy(ini_state)
ini_state = ini_state.cuda()

mat = scipy.io.loadmat(folder + 'TARGET.mat')
target = np.float32(mat['target'])
target = torch.from_numpy(target)
target = target.cuda()

class MLPNet(nn.Module):
    def __init__(self):
        super(MLPNet, self).__init__()
        self.fc1 = nn.Linear(3, 64)
        self.fc2 = nn.Linear(64, 128)
        self.fc3 = nn.Linear(128, 128)
        self.fc4 = nn.Linear(128, 41)
    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = F.relu(self.fc3(x))
        x = self.fc4(x)
        return x

    def name(self):
        return "MLP"

model = MLPNet()
model = model.cuda()

criterion = nn.MSELoss()
criterion = criterion.cuda()
learning_rate = 0.001
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

batch_size = 20
iter_size = target.size(0) // batch_size  # number of full batches per epoch
print(iter_size)

for epoch in range(50):
    for i in range(iter_size):
        start = i * batch_size
        end = (i + 1) * batch_size  # slice end is exclusive, so no "- 1"
        samples = ini_state[start:end, :]
        labels = target[start:end, :]

        optimizer.zero_grad()  # zero the gradient buffer
        outputs = model(samples)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        if (i+1) % 500 == 0:
            print("Epoch %s, batch %s, loss %s" % (epoch, i, loss.item()))
    # decay the learning rate by a factor of 10 every 7 epochs
    if (epoch+1) % 7 == 0:
        for g in optimizer.param_groups:
            g['lr'] = g['lr'] * 0.1

But when I train this simple MLP, CPU usage is around 100% while GPU usage is only around 10%. What is preventing it from using the GPU?

Your model does run on the GPU, not the CPU. GPU usage is low because both your model and your batch size are small, so each training step needs very little computation. Try increasing the batch size to around 1000, and GPU usage should go up. PyTorch also prevents operations that mix CPU and GPU data, e.g., you can't multiply a GPU tensor by a CPU tensor. So it is unlikely that part of your network runs on the CPU and another part on the GPU, unless you deliberately design it that way.
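One way to confirm this for yourself is to print the device of the parameters, inputs, and outputs. The sketch below is only an illustration: the single `nn.Linear` layer is a stand-in for the MLP in the question, the random batch of 1000 rows is made up, and it falls back to the CPU when no GPU is visible.

```python
import torch
import torch.nn as nn

# Use the GPU when one is visible, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Stand-in for the MLP in the question (assumption: 3 inputs, 41 outputs).
model = nn.Linear(3, 41).to(device)

# One large, made-up batch of 1000 rows created directly on the device.
samples = torch.randn(1000, 3, device=device)

# Every parameter and every input should report the same device type.
param_devices = {p.device.type for p in model.parameters()}
print("parameters on:", param_devices)
print("inputs on:", samples.device.type)

outputs = model(samples)
print("outputs on:", outputs.device.type, "shape:", tuple(outputs.shape))

# On a GPU machine, feeding CPU data to a GPU model raises an error
# instead of silently running on the CPU.
if device.type == "cuda":
    try:
        model(samples.cpu())
    except RuntimeError as err:
        print("mixed-device call rejected:", type(err).__name__)
```

If all three report `cuda`, the low utilization comes from the small workload rather than from a wrong device.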

By the way, you should shuffle the data when training a neural network. Since you are using mini-batch training, in each iteration you are hoping that the mini-batch approximates the whole dataset. Without shuffling, the samples in a mini-batch are likely to be highly correlated, which leads to biased estimates of the parameter update. The data loader provided by PyTorch can do the shuffling for you.
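A minimal sketch of shuffled mini-batches with `torch.utils.data.DataLoader` follows. The random tensors are stand-ins for `ini_state` and `target` (assumed shapes N x 3 and N x 41), and the dataset size of 100 is made up for illustration.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-ins for ini_state (N x 3) and target (N x 41).
ini_state = torch.randn(100, 3)
target = torch.randn(100, 41)

# TensorDataset pairs each input row with its label row.
dataset = TensorDataset(ini_state, target)

# shuffle=True draws a fresh random order of the samples every epoch.
loader = DataLoader(dataset, batch_size=20, shuffle=True)

seen = 0
for samples, labels in loader:
    # samples: (20, 3), labels: (20, 41); the training step would go here.
    seen += samples.size(0)

print("batches per epoch:", len(loader), "samples seen:", seen)
```

The inner `for` loop would replace the manual `start`/`end` slicing in the question's training loop.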
