
How do I store an intermediate convolutional layer's result in tensorflow for later processing?

The image below shows the output of a single intermediate filter layer of a CNN, before a max-pooling layer is applied. I want to store the coordinates of the pixel with intensity 4 (at the bottom right of the matrix on the LHS of the arrow). That is, the pixel at coordinate (4,4) (1-based indexing) in the left matrix is the one that ends up in the bottom-right cell of the matrix on the RHS of the arrow. I want to store this coordinate value (4,4), together with the coordinates of the other pooled pixels {(2,2) for the pixel with intensity 6, (2,4) for the pixel with intensity 8, and (3,1) for the pixel with intensity 3}, as a list for later processing. How do I do this in TensorFlow?

Max pooling done with a filter of size 2 x 2 and stride of 2
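For reference, the pooling in the figure can be sketched in plain NumPy. Only the four maxima and their positions follow the description above; the other entries are made-up filler values:

```python
import numpy as np

# A toy 4x4 input consistent with the figure: the four pooling maxima are
# 6 at (2,2), 8 at (2,4), 3 at (3,1) and 4 at (4,4) (1-based); the other
# entries are made-up filler values.
x = np.array([[1, 1, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]])

# 2x2 max pooling with stride 2: group rows and columns into 2x2 blocks,
# then take the maximum inside each block.
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)  # [[6 8]
               #  [3 4]]
```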

You can use tf.nn.max_pool_with_argmax (see the TensorFlow documentation). Note:

The indices in argmax are flattened, so that a maximum value at position [b, y, x, c] becomes flattened index ((b * height + y) * width + x) * channels + c.
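As a quick sanity check of that formula, here is a plain-NumPy sketch that inverts the flattened index for the [1, 6, 4, 1] input used in the example below (the helper name `unflatten` is made up):

```python
import numpy as np

# Shape of the example input below: [batch=1, height=6, width=4, channels=1].
batch, height, width, channels = 1, 6, 4, 1

def unflatten(idx):
    # Invert ((b * height + y) * width + x) * channels + c step by step.
    c = idx % channels
    idx //= channels
    x = idx % width
    idx //= width
    y = idx % height
    b = idx // height
    return b, y, x, c

# The maximum 0.5965447 in the sample run sits at row 1, col 1 (0-based),
# so its flattened index is ((0 * 6 + 1) * 4 + 1) * 1 + 0 = 5.
print(unflatten(5))  # (0, 1, 1, 0) -> 1-based coordinate (2, 2)
```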

We need some processing to map these flattened indices back to your (row, column) coordinates. An example:

import tensorflow as tf
import numpy as np

def max_pool_with_argmax(net, filter_h, filter_w, stride):
    output, mask = tf.nn.max_pool_with_argmax(net,
                                              ksize=[1, filter_h, filter_w, 1],
                                              strides=[1, stride, stride, 1],
                                              padding='SAME')

    # With batch size 1 and a single channel, the flattened index reduces
    # to y * width + x, so row and column can be recovered directly.
    loc_y = mask // net.shape[2]  # row index:    mask // width
    loc_x = mask % net.shape[2]   # column index: mask % width
    loc = tf.concat([loc_y + 1, loc_x + 1], axis=-1)  # indices count from 0, so add 1

    # For a general batch/channel count, unflatten step by step instead:
    # c = tf.mod(mask, net.shape[3])
    # remain = tf.cast(tf.divide(tf.subtract(mask, c), net.shape[3]), tf.int64)
    # x = tf.mod(remain, net.shape[2])
    # remain = tf.cast(tf.divide(tf.subtract(remain, x), net.shape[2]), tf.int64)
    # y = tf.mod(remain, net.shape[1])
    # remain = tf.cast(tf.divide(tf.subtract(remain, y), net.shape[1]), tf.int64)
    # b = tf.mod(remain, net.shape[0])
    # loc = tf.concat([y + 1, x + 1], axis=-1)
    return output, loc

input = tf.Variable(np.random.rand(1, 6, 4, 1), dtype=np.float32)
output, mask = max_pool_with_argmax(input,2,2,2)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    input_value,output_value,mask_value = sess.run([input,output,mask])
    print(input_value[0,:,:,0])
    print(output_value[0,:,:,0])
    print(mask_value[0,:,:,:])

This prints:
[[0.20101677 0.09207255 0.32177696 0.34424785]
 [0.4116488  0.5965447  0.20575707 0.63288754]
 [0.3145412  0.16090539 0.59698933 0.709239  ]
 [0.00252096 0.18027237 0.11163216 0.40613824]
 [0.4027637  0.1995668  0.7462126  0.68812144]
 [0.8993007  0.55828506 0.5263306  0.09376772]]
[[0.5965447  0.63288754]
 [0.3145412  0.709239  ]
 [0.8993007  0.7462126 ]]
[[[2 2]
  [2 4]]

 [[3 1]
  [3 4]]

 [[6 1]
  [5 3]]]

You can see (2,2) for the pixel with intensity 0.5965447, (2,4) for the pixel with intensity 0.63288754, and so on.

Let's say you have the following max-pooling layer:

pool_layer= tf.nn.max_pool(conv_output,
                           ksize=[1, 2, 2, 1],
                           strides=[1, 2, 2, 1],
                           padding='VALID')

you can use:

max_pos = tf.gradients([pool_layer], [conv_output])[0]
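The result max_pos is a tensor with the same shape as conv_output, holding 1 at each position selected by the pooling max and 0 elsewhere, because the gradient of max pooling flows only through the winning elements. After evaluating it, the 0/1 mask can be converted into 1-based coordinates, e.g. with NumPy. The mask below is hand-built for illustration, matching the 4x4 example from the question:

```python
import numpy as np

# Hand-built 0/1 mask standing in for an evaluated max_pos: 1 marks each
# position that won its 2x2 pooling window in the question's 4x4 input.
mask = np.array([[0, 0, 0, 0],
                 [0, 1, 0, 1],
                 [1, 0, 0, 0],
                 [0, 0, 0, 1]])

# np.argwhere lists the (row, col) of every 1; add 1 for 1-based indexing.
coords = np.argwhere(mask == 1) + 1
print(coords.tolist())  # [[2, 2], [2, 4], [3, 1], [4, 4]]
```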
