Convolutional Neural Networks batch size

Question 1.

Let's say the input image batch has shape (batch_size=100, channels=1, height=28, width=28), i.e. PyTorch's (N, C, H, W) layout,

and we feed it into the CNN model below:

import torch


class CNN(torch.nn.Module):

    def __init__(self):
        super().__init__()

        # Input shape: (100, 1, 28, 28)
        #   after Conv -> (100, 32, 28, 28)
        #   after Pool -> (100, 32, 14, 14)
        self.layer1 = torch.nn.Sequential(
            torch.nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        # second layer
        # Input shape: (100, 32, 14, 14)
        #   after Conv -> (100, 64, 14, 14)
        #   after Pool -> (100, 64, 7, 7)
        self.layer2 = torch.nn.Sequential(
            torch.nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            torch.nn.ReLU(),
            torch.nn.MaxPool2d(kernel_size=2, stride=2))

        # THIS PART CONFUSES ME!!!
        self.fc = torch.nn.Linear(7 * 7 * 64, 10, bias=True)

        torch.nn.init.xavier_uniform_(self.fc.weight)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)  # flatten: (100, 64, 7, 7) -> (100, 7*7*64)
        out = self.fc(out)
        return out

What happens to the batch size after self.fc?

Is the output size (batch_size=100, 10)?

Question 2.

Also, I am confused about mini-batches. If there are 5 mini-batches and their losses are [5, -5, 4, -4, 0], then the average loss is zero. Will the neural network stop training even though there is a loss in each mini-batch?

Question 3. Can a neural network be expressed as complex matrix multiplication?

Q1

self.fc is just a single linear layer. The key line here is out.view(out.size(0), -1), which is nothing but a flatten (like reshape in NumPy): out.size(0) is your batch size, and -1 tells PyTorch to infer the remaining dimension from the rest of the tensor (see torch.Tensor.view).

In other words, this line transforms the 4d conv-layer output of shape (Batch, C, H, W) into a 2d tensor of shape (Batch, 7 * 7 * 64). Finally, your fc layer outputs (Batch, 10), so yes: the batch size is preserved and the result is (100, 10).
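You can verify this with a quick sanity check (a minimal sketch using the CNN class from the question; the random tensor is just a stand-in for real data):

import torch

model = CNN()
x = torch.randn(100, 1, 28, 28)     # (N, C, H, W)

out = model.layer1(x)               # (100, 32, 14, 14)
out = model.layer2(out)             # (100, 64, 7, 7)
out = out.view(out.size(0), -1)     # (100, 3136), since 7 * 7 * 64 = 3136
out = model.fc(out)                 # (100, 10)
print(out.shape)                    # torch.Size([100, 10])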

Q2

This question is confusing. How did you get these loss values? Standard loss functions such as cross-entropy or MSE are non-negative, so values like -5 should not occur; it seems part of your code is missing.
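Note also that in a typical PyTorch training loop the weights are updated once per mini-batch, not once per average over several mini-batches. A minimal sketch (the model, data, and hyperparameters here are made up purely for illustration):

import torch

model = torch.nn.Linear(10, 2)                      # stand-in model
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# five fake mini-batches of (inputs, targets)
loader = [(torch.randn(100, 10), torch.randint(0, 2, (100,))) for _ in range(5)]

for inputs, targets in loader:
    optimizer.zero_grad()
    loss = criterion(model(inputs), targets)  # loss of THIS mini-batch only
    loss.backward()                           # gradients from this batch alone
    optimizer.step()                          # weights updated every mini-batch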

Q3

A multilayer perceptron is essentially matrix multiplication: we apply a nonlinear activation function to the output of each layer (which is the result of a matrix multiplication).
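For example, torch.nn.Linear literally computes x @ W.T + b (a small sketch with made-up sizes):

import torch

fc = torch.nn.Linear(4, 3)
x = torch.randn(2, 4)

manual = x @ fc.weight.T + fc.bias      # plain matrix multiply plus bias
print(torch.allclose(fc(x), manual))    # True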

A CNN can also be seen as a kind of matrix operation; this type of operation is called convolution, and it too can be rewritten as a matrix multiplication.
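As an illustration, here is a sketch showing that a Conv2d layer can be reproduced with an explicit matrix multiplication via torch.nn.functional.unfold (the im2col trick); the shapes match the first layer of the question's network:

import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 28, 28)
conv = torch.nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)

cols = F.unfold(x, kernel_size=3, padding=1)    # (1, 1*3*3, 28*28) patch matrix
w = conv.weight.view(32, -1)                    # (32, 1*3*3) weight matrix
out = (w @ cols + conv.bias.view(1, 32, 1)).view(1, 32, 28, 28)

print(torch.allclose(conv(x), out, atol=1e-5))  # True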

So, because the nonlinear activations sit between the layers, you cannot express the whole network as a single matrix multiplication: a chain of purely linear layers would collapse into one matrix, but the nonlinearities prevent this.
