
How to use RGB values in a feedforward neural network?

I have a data set of colored images as an ndarray of shape (100, 20, 20, 3) and 100 corresponding labels. When passing them as input to a fully connected neural network (not a CNN), what should I do with the 3 RGB values? Averaging them would lose some information, but if I leave them untouched, my main issue is the batch size, as demonstrated below in PyTorch.

for epoch in range(n_epochs):
    for i, (images, labels) in enumerate(train_loader):
        # Because of the RGB values, view(-1, 400) yields 3 rows per image,
        # so images is now 3 times the length of labels.
        images = images.view(-1, 400)  # (300, 400) instead of (100, 400)
        optimizer.zero_grad()
        outputs = net(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

This raises 'ValueError: Expected input batch_size (300) to match target batch_size (100).' Should I have reshaped the images into (1, 1200)-dimensional tensors? Thanks in advance for any answers.
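The mismatch is just shape arithmetic: a (100, 20, 20, 3) batch holds 100·20·20·3 = 120,000 values, and `view(-1, 400)` slices them into rows of 400, giving 300 rows against 100 labels. A quick sketch with a dummy tensor:

```python
import torch

# Dummy batch shaped like the dataset: 100 RGB images of 20x20 pixels.
images = torch.zeros(100, 20, 20, 3)

# view(-1, 400) reinterprets the 120000 values as rows of 400 -> 300 rows.
print(images.view(-1, 400).shape)   # torch.Size([300, 400])

# Flattening each whole image (20*20*3 = 1200 values) keeps the batch at 100.
print(images.view(-1, 1200).shape)  # torch.Size([100, 1200])
```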

Since the size of labels is (100,), your batch data should have shape (100, H, W, C). I'm assuming your data loader returns a tensor of shape (100, 20, 20, 3). The error happens because you reshape that tensor to (300, 400).
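In other words, the batch dimension must survive the reshape. A minimal sketch of the corrected step, assuming the hypothetical names `net`, `criterion`, and `optimizer` stand for a network whose first layer takes 1200 inputs:

```python
import torch
import torch.nn as nn

# Minimal stand-ins (hypothetical sizes) just to show the corrected reshape.
net = nn.Sequential(nn.Linear(1200, 64), nn.ReLU(), nn.Linear(64, 10))
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

images = torch.rand(100, 20, 20, 3)    # one dummy batch
labels = torch.randint(0, 10, (100,))  # 100 labels

# Flatten all 20*20*3 = 1200 values of each image into one row per sample,
# so the batch dimension stays 100 and matches the labels.
flat = images.view(images.size(0), -1)  # (100, 1200)

optimizer.zero_grad()
outputs = net(flat)                     # (100, 10)
loss = criterion(outputs, labels)
loss.backward()
optimizer.step()
print(outputs.shape)  # torch.Size([100, 10])
```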

  1. Check whether your network architecture expects an input tensor of shape (20, 20, 3).

  2. If your network can only accept single-channel images, you can first convert your RGB images to grayscale.

  3. Or, modify your network architecture so it accepts 3-channel images. One convenient way is to add an extra layer that reduces the 3 channels to 1; the other parts of the network then do not need to change.
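Options 2 and 3 can be sketched as follows. This is one possible realization, not the only one: the grayscale conversion uses the standard ITU-R BT.601 luminance weights, and the 3-to-1 reduction is a per-pixel `nn.Linear(3, 1)` inside a small hypothetical network:

```python
import torch
import torch.nn as nn

images = torch.rand(100, 20, 20, 3)  # dummy RGB batch, channels-last

# Option 2: convert RGB to grayscale with the usual luminance weights,
# then flatten to (100, 400) for a 400-input network.
weights = torch.tensor([0.299, 0.587, 0.114])
gray = (images * weights).sum(dim=-1)    # (100, 20, 20)
gray_flat = gray.view(gray.size(0), -1)  # (100, 400)

# Option 3: let the network learn the 3 -> 1 channel reduction itself,
# leaving the rest of the 400-input architecture unchanged.
class ChannelReduceNet(nn.Module):
    def __init__(self, n_classes=10):
        super().__init__()
        self.reduce = nn.Linear(3, 1)  # learned per-pixel channel mix
        self.fc = nn.Sequential(nn.Linear(400, 64), nn.ReLU(),
                                nn.Linear(64, n_classes))

    def forward(self, x):                      # x: (N, 20, 20, 3)
        x = self.reduce(x).squeeze(-1)         # (N, 20, 20)
        return self.fc(x.view(x.size(0), -1))  # (N, n_classes)

out = ChannelReduceNet()(images)
print(out.shape)  # torch.Size([100, 10])
```

`nn.Linear` applies to the last dimension of any input, which is why the same layer can mix the 3 channels at every pixel position.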

