Applying convolution operation to image - PyTorch

To render an image of shape 27x35 I use:

import random
import numpy as np
import matplotlib.pyplot

# 945 random pixel values (27 * 35 = 945)
random_image = []
for x in range(1, 946):
    random_image.append(random.randint(0, 255))

random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27, 35))

This generates:

[image: the rendered 27x35 random image]

I then try to apply a convolution to the image using torch.nn.Conv2d:

conv2 = torch.nn.Conv2d(3, 18, kernel_size=3, stride=1, padding=1)

image_d = np.asarray(random_image_arr.reshape(27, 35))

conv2(torch.from_numpy(image_d))

But this raises the following error:

~/.local/lib/python3.6/site-packages/torch/nn/modules/conv.py in forward(self, input)
    299     def forward(self, input):
    300         return F.conv2d(input, self.weight, self.bias, self.stride,
--> 301                         self.padding, self.dilation, self.groups)
    302 
    303 

RuntimeError: input has less dimensions than expected

The shape of the input image_d is (27, 35).

Should I change the parameters of Conv2d in order to apply the convolution to the image?

Update. From @McLawrence's answer I have:

random_image = []
for x in range(1, 946):
    random_image.append(random.randint(0, 255))

random_image_arr = np.array(random_image)
matplotlib.pyplot.imshow(random_image_arr.reshape(27, 35))

This renders the image:

[image: the rendered 27x35 random image]

Applying the convolution operation:

conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)

image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27, 35))).numpy()

fc = conv2(torch.from_numpy(image_d))

matplotlib.pyplot.imshow(fc[0][0].data.numpy())

This renders the image:

[image: the output of the convolution]

There are two problems with your code:

First, 2d convolutions in PyTorch are defined only for 4d tensors. This is convenient for use in neural networks: the first dimension is the batch size, while the second is the channels (an RGB image, for example, has three channels). So you have to reshape your tensor like:

image_d = torch.FloatTensor(np.asarray(random_image_arr.reshape(1, 1, 27 , 35)))

The FloatTensor is important here, since convolutions are not defined on a LongTensor, which is created automatically if your numpy array only contains ints.
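As a quick illustration (a minimal sketch; it assumes numpy as np and torch are already imported, as in the code above):

t = torch.from_numpy(np.array([[1, 2], [3, 4]]))  # all-int array -> integer tensor, which Conv2d rejects
t = t.float()  # cast to float32 so the convolution can be applied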

Secondly, you have created a convolution with three input channels, while your image has just one channel (it is greyscale). So you have to adjust the convolution to:

conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)
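Putting both fixes together, here is a minimal end-to-end sketch (the random input simply stands in for the 27x35 image from the question; the layer sizes are the ones used above):

import numpy as np
import torch

random_image_arr = np.random.randint(0, 256, size=945)  # 27 * 35 = 945 random pixel values
image_d = torch.from_numpy(random_image_arr.reshape(1, 1, 27, 35)).float()  # 4d FloatTensor: (batch, channels, H, W)
conv2 = torch.nn.Conv2d(1, 18, kernel_size=3, stride=1, padding=1)  # 1 input channel for a greyscale image
out = conv2(image_d)
print(out.shape)  # torch.Size([1, 18, 27, 35])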
