
Multi-Layer Neural Network to classify images with PyTorch Error

I need to classify flower images into 7 classes. I initialized batch_size as 128 and image_size as 32. I am required to use Linear layers, so I created a multi-layer neural network class as seen below:

number_of_classes = 7
input_dim = image_size
num_hidden = [batch_size * 3 * image_size ,4,number_of_classes]
output_dim = 1

class MultiLayerNeuralNetwork(nn.Module):

    def __init__(self,input_dim, num_hidden, output_dim):
        super(MultiLayerNeuralNetwork, self).__init__()

        #first layer
        self.first_layer = nn.Linear(input_dim, num_hidden[0])

        # initialize weight and bias
        nn.init.kaiming_uniform_(self.first_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.first_layer.bias.data, 0)
        print(self.first_layer)
        #initialize hidden layers 
        self.hidden = nn.ModuleList()
        for k in range(len(num_hidden)-1):
            self.hidden.append(nn.Linear(num_hidden[k], num_hidden[k+1]))

            # initialize weight and bias in hidden layer
            nn.init.kaiming_uniform_(self.hidden[k].weight, nonlinearity="relu")
            nn.init.constant_(self.hidden[k].bias.data, 0)
            print(self.hidden[k])

        # output layer
        self.output_layer = nn.Linear(num_hidden[-1], output_dim)

        # initialize weight and bias
        nn.init.kaiming_uniform_(self.output_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.output_layer.bias.data, 0)

        print(self.output_layer)
       
    def forward(self, x):
        x = torch.nn.functional.relu(self.first_layer(x))
        print("x:", x.shape)
        for layer in self.hidden:
            x = torch.nn.functional.relu(layer(x))
        x = torch.nn.functional.sigmoid(self.output_layer(x))

        return x
       
multi_layer_nn_model = MultiLayerNeuralNetwork(input_dim, num_hidden, output_dim)

print(multi_layer_nn_model)

But when I tried to train this model, I got this error message: "only batches of spatial targets supported (3D tensors) but got targets of size: : [128]"

At this line:

loss = loss_function(outputs, labels)

I got these shapes:

outputs: torch.Size([128, 3, 32, 1])

labels: torch.Size([128])

How can I handle this situation?

Note: Also, my outputs are between [0, 2], but they have to be between [1, 7]. How can I handle this too?
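
(For reference: assuming loss_function is nn.CrossEntropyLoss, which is where this error message typically comes from, a minimal sketch of the shapes it expects for a 7-class problem looks like this; the dummy tensors are only for illustration.)

import torch
import torch.nn as nn

# nn.CrossEntropyLoss expects logits of shape [batch, num_classes] and
# integer class indices of shape [batch] in the range 0..num_classes-1.
batch_size, num_classes = 128, 7
logits = torch.randn(batch_size, num_classes)        # dummy raw scores from a model
labels_1_to_7 = torch.randint(1, 8, (batch_size,))   # dummy labels in 1..7
targets = labels_1_to_7 - 1                          # shift to 0..6 for the loss

loss = nn.CrossEntropyLoss()(logits, targets)
print(loss)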

Thanks a lot.

Here are some fixes:

  1. output_dim is set to 1, so the final layer projects into 1 dimension instead of 7.
  2. Since you are using only linear layers, the input needs to be flattened into a 2-dimensional tensor of shape [batch_size, 3 * image_width * image_height].
  3. The layers need to be modified accordingly as well.

Below is a modified version of your code snippet. Modify it according to your needs.

import torch.nn as nn
import torch

number_of_classes = 7
batch_size, image_size = 128, 32
input_dim = 3 * image_size * image_size
num_hidden = [3 * image_size, 4, number_of_classes]
output_dim = 7

class MultiLayerNeuralNetwork(nn.Module):

    def __init__(self,input_dim, num_hidden, output_dim):
        super(MultiLayerNeuralNetwork, self).__init__()

        #first layer
        self.first_layer = nn.Linear(input_dim, num_hidden[0])

        # initialize weight and bias
        nn.init.kaiming_uniform_(self.first_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.first_layer.bias.data, 0)
        # print(self.first_layer)
        #initialize hidden layers 
        self.hidden = nn.ModuleList()
        for k in range(len(num_hidden)-1):
            self.hidden.append(nn.Linear(num_hidden[k], num_hidden[k+1]))

            # initialize weight and bias in hidden layer
            nn.init.kaiming_uniform_(self.hidden[k].weight, nonlinearity="relu")
            nn.init.constant_(self.hidden[k].bias.data, 0)
            print(self.hidden[k])

        # output layer
        self.output_layer = nn.Linear(num_hidden[-1], output_dim)

        # initialize weight and bias
        nn.init.kaiming_uniform_(self.output_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.output_layer.bias.data, 0)

        print(self.output_layer)
       
    def forward(self, x):
        cur_shape = x.shape
        print(cur_shape, 'is shape')
        x = torch.nn.functional.relu(self.first_layer(x.view(cur_shape[0], -1)))
        print("x:", x.shape)
        for layer in self.hidden:
            x = torch.nn.functional.relu(layer(x))
        print(f'Before sigmoid: {self.output_layer(x).shape}')
        x = torch.nn.functional.sigmoid(self.output_layer(x))

        return x
       
multi_layer_nn_model = MultiLayerNeuralNetwork(input_dim, num_hidden, output_dim)

# print(multi_layer_nn_model)
x = torch.rand(batch_size, 3, image_size, image_size)
y = multi_layer_nn_model(x)
print('x', x.shape, 'y', y.shape)

Gives the following output:

Linear(in_features=96, out_features=4, bias=True)
Linear(in_features=4, out_features=7, bias=True)
Linear(in_features=7, out_features=7, bias=True)
torch.Size([128, 3, 32, 32]) is shape
x: torch.Size([128, 96])
Before sigmoid: torch.Size([128, 7])

x torch.Size([128, 3, 32, 32]) y torch.Size([128, 7])

P.S. I feel cross-entropy loss might be a better choice for this case.
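
For example, a minimal sketch of one training step with cross-entropy (assuming the modified model above, with the final sigmoid removed so that forward returns raw logits, and with labels already shifted to 0..6; the optimizer choice is only an illustration):

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(multi_layer_nn_model.parameters(), lr=0.01)

images = torch.rand(batch_size, 3, image_size, image_size)      # dummy batch
targets = torch.randint(0, number_of_classes, (batch_size,))    # dummy labels in 0..6

logits = multi_layer_nn_model(images)   # shape [128, 7]
loss = criterion(logits, targets)

optimizer.zero_grad()
loss.backward()
optimizer.step()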

I think this should help. I have included comments in the code for brevity. I have used CrossEntropyLoss, which is generally used for multiclass classification problems. Notice that NLLLoss can also be used, but it operates on the output of log_softmax. Hopefully it helps.
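
To illustrate that relationship with dummy tensors: nn.CrossEntropyLoss applied to raw logits gives the same value as nn.NLLLoss applied to the log_softmax of those logits.

import torch
import torch.nn as nn
import torch.nn.functional as F

logits = torch.randn(8, 7)             # dummy raw scores for a batch of 8, 7 classes
targets = torch.randint(0, 7, (8,))    # dummy class indices in 0..6

ce = nn.CrossEntropyLoss()(logits, targets)
nll = nn.NLLLoss()(F.log_softmax(logits, dim=1), targets)
print(ce, nll)   # the two losses match up to floating-point error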

Setting up the necessary parameters

number_of_classes = 7
input_dim = 1024
image_size = (32, 32)
# change the batch size if required
batch_size = 8
# you can change the num_hidden size according to the requirements
num_hidden = [input_dim*3 ,input_dim*4, input_dim, 256, 64]

Model Definition

import torch
import torch.nn as nn

class MultiLayerNeuralNetwork(nn.Module):
    def __init__(self, input_dim, num_hidden, output_dim):
        super(MultiLayerNeuralNetwork, self).__init__()
        # first layer
        self.first_layer = nn.Linear(input_dim, num_hidden[0])
        # initialize weight and bias
        nn.init.kaiming_uniform_(self.first_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.first_layer.bias.data, 0)

        # initialize hidden layers
        self.hidden = nn.ModuleList()
        for k in range(0, len(num_hidden)-1):
            self.hidden.append(nn.Linear(num_hidden[k], num_hidden[k+1]))
            # initialize weight and bias in hidden layer
            nn.init.kaiming_uniform_(self.hidden[k].weight, nonlinearity="relu")
            nn.init.constant_(self.hidden[k].bias.data, 0)

        # output layer
        self.output_layer = nn.Linear(num_hidden[-1], output_dim)

        # initialize weight and bias
        nn.init.kaiming_uniform_(self.output_layer.weight, nonlinearity="relu")
        nn.init.constant_(self.output_layer.bias.data, 0)


    def forward(self, x):
        x = self.first_layer(x)
        for layer in self.hidden:
            x = layer(x)
        return self.output_layer(x)
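
Note that this forward applies no nonlinearity between the Linear layers; the kaiming_uniform_ initialization above assumes ReLU, so if you want activations between the layers, a minimal variant of the forward method (a sketch only, to be placed inside the class above) could look like this:

import torch.nn.functional as F

# sketch: same forward, but with ReLU between the Linear layers;
# the output layer still returns raw logits for CrossEntropyLoss
def forward(self, x):
    x = F.relu(self.first_layer(x))
    for layer in self.hidden:
        x = F.relu(layer(x))
    return self.output_layer(x)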

Model Instance Declaration

# number of classes are 7, so is the output_dim
output_dim = 7
model = MultiLayerNeuralNetwork(input_dim, num_hidden, output_dim)
print(model)
Model Summary
MultiLayerNeuralNetwork(
  (first_layer): Linear(in_features=1024, out_features=3072, bias=True)
  (hidden): ModuleList(
    (0): Linear(in_features=3072, out_features=4096, bias=True)
    (1): Linear(in_features=4096, out_features=1024, bias=True)
    (2): Linear(in_features=1024, out_features=256, bias=True)
    (3): Linear(in_features=256, out_features=64, bias=True)
  )
  (output_layer): Linear(in_features=64, out_features=7, bias=True)
)
Feeding a batch to the model
import numpy as np

samples = torch.from_numpy(np.random.randn(8, 32, 32)).float()
labels = torch.tensor([1, 5, 1, 2, 4, 1, 3, 6]).long()

# notice that the input should be vectorized while the batch dimension is left alone
# input shape is torch.Size([8, 1024])
samples = samples.view(samples.shape[0], -1)
# Forward Pass
output = model(samples)
# output shape is torch.Size([8, 7]),
# 1st dimension represents the batch dimension, a batch of size 8
# notice the 2nd dimension, 7 classes
 
Calculating the loss
loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(output, labels)
# loss for our dummy batch comes out to be tensor(10.3904, grad_fn=<NllLossBackward0>)
Prediction
preds = torch.argmax(torch.softmax(output, dim=1), dim=1)
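
Since the original labels are in 1..7 while CrossEntropyLoss works with class indices 0..6, and assuming the dataset labels were shifted down by one before training, the predicted indices can be mapped back afterwards:

# assuming labels 1..7 were shifted to 0..6 for training,
# map the predicted indices back to the original labeling
original_class_labels = preds + 1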
