
Create an “unpooling” mask from output layers in Keras

I'm writing a CNN in Keras with the TensorFlow backend. I'm trying to create an "unpooling" mask (or pooling indices) as described here: https://arxiv.org/pdf/1511.00561.pdf
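To make the idea concrete, here's a tiny NumPy sketch (not part of my net, just an illustration) of what the mask computes: max-pool a small feature map, upsample the pooled value back to the original size, and mark the positions the maxima came from.

import numpy as np

feat = np.array([[1., 3.],
                 [2., 0.]])
pooled = feat.max()                  # 2x2 max pooling over the whole map -> 3.0
upsampled = np.full((2, 2), pooled)  # upsampling repeats the pooled value
mask = (feat >= upsampled).astype('float32')
print(mask)  # [[0. 1.], [0. 0.]] -- 1.0 only where the maximum came from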

I've built a CNN without this unpooling mask and it works fine. I create the mask the following way (this is just part of a bigger net; the same idea applies at every conv/max-pooling block):

img_input = Input(shape=(num_channels, img_h, img_w))
x = conv_block(img_input, kernel, 512)
orig = x #Save output x
x = MaxPooling2D()(x)

x = UpSampling2D()(x)

# 1.0 where the upsampled value reaches the pre-pooling maximum, 0.0 elsewhere
bool_mask = K.greater_equal(orig, x)
mask = K.cast(bool_mask, dtype='float32')

mask_input = Input(tensor=mask) # makes the mask a Keras tensor to use as input
x = keras.layers.multiply([mask_input, x])
x = deconv_block(x, kernel, 512, 512)

x = Reshape((n_labels, img_h * img_w))(x)
x = Permute((2, 1))(x)
main_output = Activation('softmax')(x)

model = Model(inputs=img_input, outputs=main_output)

Since I create the "second input" mask_input from other layers, I don't want to have it as a model input. But if I don't, I can't create the model. If I change the last line to:

model = Model(inputs=[img_input, mask_input], outputs=main_output)

I can now create the model, but when I want to use it I need a second input, which I don't have until the network itself has produced it.

Does anyone have a different solution for creating an unpooling mask, or know how to work around the problem with the extra input?

I would put all operations inside layers, which is what the model expects. (I assume the functions conv_block and deconv_block are made entirely of layers; otherwise, they should go into Lambda layers as well.)
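For instance, here is a toy sketch (illustrative only, not your network) of wrapping a raw backend op in a Lambda layer:

from keras.layers import Input, Lambda
from keras.models import Model
import keras.backend as K

#any backend op wrapped in a Lambda becomes a proper layer the model can track
inp = Input(shape=(4,))
out = Lambda(lambda t: K.relu(t), output_shape=(4,))(inp)
toy_model = Model(inp, out)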

You don't need that processed x to be an input. You can split your model as you did, and then merge it again, making parallel branches.

I couldn't test with your data and your dimensions, but the simple concatenation test I ran (shown at the end with MNIST data) works. (I tested in Theano, since I don't have TensorFlow. I hope everything works OK, but you may have to experiment with different axes for the concatenation and greater_equal.)

from keras.models import Model
from keras.layers import (Input, MaxPooling2D, UpSampling2D, Reshape,
                          Permute, Activation, Concatenate, Lambda, Multiply)
import keras.backend as K

img_input = Input(shape=(num_channels, img_h, img_w))

x = conv_block(img_input, kernel, 512)

orig = x #Save output x
x = MaxPooling2D()(x)

x = UpSampling2D()(x)

#here we're going to reshape the data for a concatenation:
#xReshaped and origReshaped are now split branches
xReshaped = Reshape((1, channels_after_conv_block, h_after, w_after))(x)
origReshaped = Reshape((1, channels_after_conv_block, h_after, w_after))(orig)

#concatenation - here, you unite both branches again
#normally you don't need to reshape or use the axis var,
#but here we want to keep track of what was x and what was orig
together = Concatenate(axis=1)([origReshaped, xReshaped])

bool_mask = Lambda(lambda t: K.greater_equal(t[:, 0], t[:, 1]),
                   output_shape=(channels_after_conv_block, h_after, w_after))(together)
mask = Lambda(lambda t: K.cast(t, dtype='float32'))(bool_mask)

x = Multiply()([mask, x])
x = deconv_block(x, kernel, 512, 512)

x = Reshape((n_labels, img_h * img_w))(x)
x = Permute((2, 1))(x)
main_output = Activation('softmax')(x)

model = Model(inputs=img_input, outputs=main_output)
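Since the mask is now built entirely from layers, the model keeps a single input and can be compiled and trained as usual. A minimal sketch (the optimizer and loss are placeholders here, pick whatever fits your task):

model.compile(optimizer='adam', loss='categorical_crossentropy')
model.summary()
#model.fit(images, labels, ...) with a single image input, as usual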

Here is the simple test I ran with MNIST data:

from keras.datasets import mnist
from keras.layers import Input, Reshape, Concatenate, Lambda
from keras.models import Model

#load MNIST just to have image data for the two inputs
(xTraining, _), _ = mnist.load_data()
xTraining = xTraining.astype('float32') / 255.

inp1 = Input((28, 28))
inp2 = Input((28, 28))
in1 = Reshape((1, 28, 28))(inp1)
in2 = Reshape((1, 28, 28))(inp2)
c = Concatenate(axis=1)([in1, in2])

#here, the lambda expression sums two MNIST images
c = Lambda(lambda x: x[:, 0] + x[:, 1], output_shape=(28, 28))(c)

m = Model([inp1, inp2], c)

res = m.predict([xTraining[:10], xTraining[10:20]])
print(res.shape)  # (10, 28, 28)

#if you plot res, you will see the images superposed, which is the expected result
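If you want to see that visually, a quick sketch (assuming matplotlib is available):

import matplotlib.pyplot as plt

plt.imshow(res[0], cmap='gray')  # the two input digits superposed
plt.show()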
