
How to define a Recurrent Convolutional network layer in CNTK?

I am new to CNTK and am using its awesome Python API. I am having trouble figuring out how to define a recurrent convolutional network layer, since Recurrence() seems to assume a regular network layer only.

To be more specific, I would like to have recurrence among convolutional layers.

Any pointer or even a simple example would be highly appreciated. Thank you.

There are two ways to do this in a meaningful way (i.e. without destroying the spatial structure of natural images that convolutions rely on). The simplest is to put an LSTM after the final convolutional layer, i.e.

import cntk as C

convnet = C.layers.Sequential([C.layers.Convolution(...), C.layers.MaxPooling(...), C.layers.Convolution(...), ...])
z = C.layers.Sequential([convnet, C.layers.Recurrence(C.layers.LSTM(100)), C.layers.Dense(10)])

for a 10-class problem.

The more complex way is to define your own recurrent cell that uses only convolutions and thus respects the spatial structure of natural images. To define a recurrent cell, you write a function that takes the previous state and an input (i.e. a single frame if you are processing video) and returns the next state and output. For example, you can look at the implementation of the GRU in the CNTK layers module and adapt it to use convolution instead of times everywhere. If this is what you want, I can try to provide such an example. However, I encourage you to try the simple way first.
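As a plain-Python illustration of that contract (this is not CNTK code, and `unroll` and `leaky_average` are hypothetical names made up for this sketch), a recurrence roughly amounts to folding a step function over the sequence while carrying the state along. The toy cell here uses its state as its output, which is also how a GRU behaves:

```python
from typing import Callable, List, Sequence, TypeVar

S = TypeVar("S")  # state type
X = TypeVar("X")  # per-step input type


def unroll(step: Callable[[S, X], S], initial_state: S, inputs: Sequence[X]) -> List[S]:
    """Apply step(prev_state, x) along a sequence and collect every state."""
    states = []
    state = initial_state
    for x in inputs:
        state = step(state, x)  # next state depends on previous state and current input
        states.append(state)
    return states


def leaky_average(prev: float, x: float) -> float:
    """Toy cell (hypothetical): blend the previous state with the new input."""
    return 0.5 * prev + 0.5 * x


states = unroll(leaky_average, 0.0, [2.0, 4.0, 8.0])
print(states)  # [1.0, 2.5, 5.25]
```

A convolutional cell has the same shape of contract; only the state and input become image-like tensors and the blend is computed with convolutions.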

Update: I wrote a barebones convolutional GRU. You need to pay special attention to how the initial state is defined, but otherwise it seems to work fine. Here's the layer definition:

def ConvolutionalGRU(kernel_shape, outputs, activation=C.tanh, init=C.glorot_uniform(), init_bias=0, name=''):
    conv_filter_shape = (outputs, C.InferredDimension) + kernel_shape
    bias_shape = (outputs,1,1)
    # parameters
    bz = C.Parameter(bias_shape, init=init_bias, name='bz')  # update gate bias
    br = C.Parameter(bias_shape, init=init_bias, name='br')  # reset gate bias
    bh = C.Parameter(bias_shape, init=init_bias, name='bh')  # candidate state bias
    Wz = C.Parameter(conv_filter_shape, init=init, name='Wz') # input-to-hidden, update gate
    Wr = C.Parameter(conv_filter_shape, init=init, name='Wr') # input-to-hidden, reset gate
    Uz = C.Parameter(conv_filter_shape, init=init, name='Uz') # hidden-to-hidden, update gate
    Ur = C.Parameter(conv_filter_shape, init=init, name='Ur') # hidden-to-hidden, reset gate
    Wh = C.Parameter(conv_filter_shape, init=init, name='Wh') # input-to-hidden, candidate state
    Uh = C.Parameter(conv_filter_shape, init=init, name='Uh') # hidden-to-hidden, candidate state
    # Convolutional GRU model function
    def conv_gru(dh, x):
        zt = C.sigmoid (bz + C.convolution(Wz, x) + C.convolution(Uz, dh))        # update gate z(t)
        rt = C.sigmoid (br + C.convolution(Wr, x) + C.convolution(Ur, dh))        # reset gate r(t)
        rs = dh * rt                                                            # hidden state after reset
        ht = zt * dh + (1-zt) * activation(bh + C.convolution(Wh, x) + C.convolution(Uh, rs))
        return ht
    return conv_gru
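Written out as equations, the cell above computes the following at each time step. The notation is introduced here for readability and is not part of the original code: $x_t$ stands for `x`, $h_{t-1}$ for `dh`, $*$ for convolution, $\odot$ for the element-wise product, $\sigma$ for the sigmoid, and $\phi$ for the `activation` argument (tanh by default).

```latex
\begin{aligned}
z_t &= \sigma\left(b_z + W_z * x_t + U_z * h_{t-1}\right) \\
r_t &= \sigma\left(b_r + W_r * x_t + U_r * h_{t-1}\right) \\
\tilde{h}_t &= \phi\left(b_h + W_h * x_t + U_h * (r_t \odot h_{t-1})\right) \\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
\end{aligned}
```

Each bias has shape (outputs, 1, 1), so it broadcasts over the spatial dimensions of the feature maps.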

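And here is how to use it. The initial state must have the shape of the hidden state, (outputs, height, width), which is (32, 224, 224) here: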

import numpy as np

x = C.sequence.input_variable((3,224,224))
z = C.layers.Recurrence(ConvolutionalGRU((3,3), 32), initial_state=C.constant(0, (32,224,224)))
y = z(x)
x0 = np.random.randn(16,3,224,224).astype('f') # a single seq. with 16 random "frames"
output = y.eval({x:x0})
print(output[0].shape)  # (16, 32, 224, 224)
