
How to define a 2D convolution on tensors with rank greater than 4 in keras/tensorflow

I'm attempting to find a way to perform 2D convolutions over tensors which are of higher dimensionality than 4, which is the input rank required by keras.layers.Conv2D and keras.backend.conv2d. By this I mean that, instead of having an input of size [batch, W, H, C], I would like to be able to use [batch, <some_other_dimensions>, W, H, C], with those other dimensions treated essentially the same way as 'batch' (i.e. unaffected by the convolution). Unsurprisingly, this throws an error, so reshaping the array seems like the most straightforward solution; however, there are issues with this (as described below).
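
To make the failure concrete, here is a minimal sketch of the naive attempt (the shape values are just an example): Conv2D's input spec requires a rank-4 tensor, so feeding it a rank-6 one fails immediately.

from keras.layers import Input, Conv2D

#a 4-by-4 set of images, each W x H x 3, with W and H left undefined
x = Input(shape=(4, 4, None, None, 3))   #rank-6 tensor once batch is prepended
y = Conv2D(10, (3, 3))(x)                #raises ValueError: expected ndim=4, found ndim=6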

When reshaping, I'm messing with what Keras sees as the batch dimension, so I need to use keras.backend.reshape rather than keras.layers.Reshape, which cannot touch the batch dimension of your data. Using a Lambda layer and keras.backend.reshape, we can reshape the input arrays to size [batch*<some_other_dimensions>, W, H, C] and then reshape them back again after performing our convolution. However, it's essential that this layer can form part of a Fully Convolutional Network that works on arbitrary image sizes, where W and H are undefined (set as None when instantiating the Input layer's shape). We therefore end up passing a shape with two undefined spatial dimensions to keras.backend.reshape, which it obviously can't use: [batch*<some_other_dimensions>, None, None, C].

I can get it to work when explicitly declaring the width and height (as you can see in my code). However, I really don't want to sacrifice the ability to ingest arbitrarily sized spatial dimensions, as it's important for my project to be able to do so.

The only other option I can think of is defining my own custom layer that can handle inputs of rank greater than 4 for 2D convolutions. I don't really know where to start with that, but advice would be most welcome if people think that is the most viable route. Or is there perhaps a nifty lambda layer out there that solves my problem?
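
For what it's worth, here is a rough, untested sketch of what such a custom layer might look like (the name ConvND and the weight-wiring details are my own assumptions, not an established API): it folds all leading dimensions into the batch axis using the dynamic shape, applies an inner Conv2D, and unfolds afterwards.

from keras.layers import Conv2D, Layer
import keras.backend as K

class ConvND(Layer):
    #hypothetical wrapper: applies an inner Conv2D over the last three
    #dimensions of an N-D input by folding the leading dims into the batch axis
    def __init__(self, filters, kernel_size, **kwargs):
        super(ConvND, self).__init__(**kwargs)
        self.conv = Conv2D(filters, kernel_size)

    def build(self, input_shape):
        #build the inner convolution on the folded 4D shape
        self.conv.build((None,) + tuple(input_shape[-3:]))
        #expose the inner layer's weights (wiring details vary by Keras version)
        self._trainable_weights = self.conv.trainable_weights
        super(ConvND, self).build(input_shape)

    def call(self, x):
        shape = K.shape(x)  #dynamic shape, usable even when W and H are None
        x4d = K.reshape(x, K.concatenate([K.constant([-1], dtype='int32'), shape[-3:]]))
        y = self.conv.call(x4d)
        #restore the leading dims, keep the convolved spatial/channel dims
        return K.reshape(y, K.concatenate([shape[:-3], K.shape(y)[-3:]]))

    def compute_output_shape(self, input_shape):
        inner = self.conv.compute_output_shape((None,) + tuple(input_shape[-3:]))
        return tuple(input_shape[:-3]) + tuple(inner[1:])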

from keras.layers import Input, Conv2D, Lambda
import keras.backend as K
from keras.models import Model


def reshape_then_conv(data_shape):

    input = Input(shape=data_shape)
    #should equal (None, *data_shape) because batch is prepended as None
    print('      INPUT SHAPE: ', input.shape)

    #reshaping input into 4D
    reshaped = Lambda(lambda x: K.reshape(x,(-1, *input.shape[3:])))(input)
    print('    AFTER RESHAPE: ', reshaped.shape)

    #convolve new 4D tensor
    convolved = Conv2D(10,(3,3),strides=2)(reshaped)
    print('AFTER CONVOLUTION: ', convolved.shape)

    #reshaping back but keeping new spatial and channel dimensions from convolution
    reshaped_back = Lambda(lambda x: K.reshape(x,(-1,*input.shape[1:3],*convolved.shape[-3:])))(convolved)
    return Model(inputs=input,outputs=reshaped_back)



#images of size 100,100,3 in 4-by-4 set
layer = reshape_then_conv([4,4,100,100,3])
print('     OUTPUT SHAPE: ', layer.output_shape,'\n')
#images of undefined size in 4-by-4 set
layer = reshape_then_conv([4,4,None,None,3])
print('     OUTPUT SHAPE: ', layer.output_shape)

As expected, the first call to 'reshape_then_conv' works, as we explicitly set the width and height. However, the second example gives:

TypeError: Failed to convert object of type <class 'tuple'> to Tensor. Contents: (-1, Dimension(None), Dimension(None), Dimension(3)). Consider casting elements to a supported type.

Thanks in advance for any insights you might have!

UPDATE

Thanks to @DMolony's answer, I've rearranged the code as follows...

from keras.layers import Input, Conv2D, Lambda
import keras.backend as K
from keras.models import Model

def reshape_then_conv(data_shape):

    input = Input(shape=data_shape)
    print('      INPUT SHAPE: ', input.shape)
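    #K.shape(input) is the dynamic (run-time) shape, so unknown dims are handled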
    new_shape = K.concatenate((K.variable([-1],dtype='int32'),K.shape(input)[3:]))
    #reshaping input into 4D
    reshaped = Lambda(lambda x: K.reshape(x,new_shape))(input)
    print('    AFTER RESHAPE: ', reshaped.shape)

    #convolve new 4D tensor
    convolved = Conv2D(10,(3,3),strides=2)(reshaped)
    print('AFTER CONVOLUTION: ', convolved.shape)

    returning_shape = K.concatenate((K.variable([-1],dtype='int32'),K.shape(input)[1:3],K.shape(convolved)[-3:]))
    #reshaping back but keeping new spatial and channel dimensions from convolution
    reshaped_back = Lambda(lambda x: K.reshape(x,returning_shape))(convolved)
    return Model(inputs=input,outputs=reshaped_back)

#images of size 100,100,3 in 4-by-4 set
layer = reshape_then_conv([4,4,100,100,3])
print('     OUTPUT SHAPE: ', layer.output_shape,'\n')
#images of undefined size in 4-by-4 set
layer = reshape_then_conv([4,4,None,None,3])
print('     OUTPUT SHAPE: ', layer.output_shape)

ANSWER (from @DMolony):

When reshaping, instead of using the static shape, e.g.

input.shape[1:3]

try providing the dynamic shape, e.g.

K.shape(input)[1:3]
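
To make the distinction concrete, a small sketch (the shape is just an example): input.shape is the static shape recorded when the graph is built and may contain None, while K.shape(input) evaluates to an int32 tensor at run time, which is why it can safely be fed to K.reshape.

from keras.layers import Input
import keras.backend as K

x = Input(shape=(4, 4, None, None, 3))
print(x.shape)       #static shape: unknown dims stay unknown, e.g. (?, 4, 4, ?, ?, 3)
print(K.shape(x))    #dynamic shape: an int32 tensor, resolved when data is fed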
