tensorflow tf.extract_image_patches

Question

The official TensorFlow documentation for the extract_image_patches function says:

tf.extract_image_patches(
    images,
    ksizes,
    strides,
    rates,
    padding,
    name=None
)

I understand all of the required parameters except rates. The reason is probably the explanation given in the API documentation:

rates: A list of ints that has length >= 4. 1-D of length 4. 
Must be: [1, rate_rows, rate_cols, 1]. This is the input stride, 
specifying how far two consecutive patch samples are in the input. 
Equivalent to extracting patches with 
patch_sizes_eff = patch_sizes + (patch_sizes - 1) * (rates - 1), 
followed by subsampling them spatially by a factor of rates. This is 
equivalent to rate in dilated (a.k.a. Atrous) convolutions.

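This only confuses me further: what is the difference between strides and rates? I would appreciate it if someone could explain the rates parameter in plain language with a simple example. I have seen a few examples of extracting patches from a given image, and all of them use [1, 1, 1, 1]. Should it always be 1? Any help is appreciated.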

Answer 1

This is how the method works:

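ksizes determines the dimensions of each patch, that is, how many pixels each patch contains.
strides is the distance in the original image between the start of one patch and the start of the next consecutive patch.
rates is the sampling step inside a patch: each consecutive pixel that ends up in the patch is taken rates pixels further along in the original image. (The example below helps illustrate this.)
padding is either "VALID", which means every patch must be fully contained in the image, or "SAME", which means patches may be incomplete (the missing pixels are filled with zeros).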
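To make the difference between strides and rates concrete, here is a minimal one-dimensional sketch in plain Python. It is not TensorFlow's implementation; the helper name patch_indices_1d is made up for illustration, and it assumes VALID padding. strides moves the start of each patch, while rates spaces out the samples inside a patch.

```python
def patch_indices_1d(length, ksize, stride, rate):
    """Return, for each patch, the input indices it samples (VALID padding).

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    # A patch of `ksize` samples spaced `rate` apart spans this many input pixels.
    effective = ksize + (ksize - 1) * (rate - 1)
    patches = []
    start = 0
    while start + effective <= length:
        # `rate` is the gap between consecutive samples inside one patch.
        patches.append([start + k * rate for k in range(ksize)])
        # `stride` is the gap between the starts of consecutive patches.
        start += stride
    return patches


# 10 pixels, 3 samples per patch, patch starts 5 pixels apart.
print(patch_indices_1d(10, ksize=3, stride=5, rate=1))  # [[0, 1, 2], [5, 6, 7]]
print(patch_indices_1d(10, ksize=3, stride=5, rate=2))  # [[0, 2, 4], [5, 7, 9]]
```

With rate=1 each patch reads three adjacent pixels; with rate=2 it still reads three pixels, but skips every other one, so the patch covers five input pixels.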

Here is some sample code, with output, to help demonstrate how it works:

import tensorflow as tf

n = 10
# images is a 1 x 10 x 10 x 1 array that contains the numbers 1 through 100 in order
images = [[[[x * n + y + 1] for y in range(n)] for x in range(n)]]

# We generate four outputs as follows:
# 1. 3x3 patches with stride length 5
# 2. Same as above, but the rate is increased to 2
# 3. 4x4 patches with stride length 7; only one patch should be generated
# 4. Same as above, but with padding set to 'SAME'
# Note: tf.Session and tf.extract_image_patches are TensorFlow 1.x APIs.
with tf.Session() as sess:
  print(tf.extract_image_patches(images=images, ksizes=[1, 3, 3, 1], strides=[1, 5, 5, 1], rates=[1, 1, 1, 1], padding='VALID').eval(), '\n\n')
  print(tf.extract_image_patches(images=images, ksizes=[1, 3, 3, 1], strides=[1, 5, 5, 1], rates=[1, 2, 2, 1], padding='VALID').eval(), '\n\n')
  print(tf.extract_image_patches(images=images, ksizes=[1, 4, 4, 1], strides=[1, 7, 7, 1], rates=[1, 1, 1, 1], padding='VALID').eval(), '\n\n')
  print(tf.extract_image_patches(images=images, ksizes=[1, 4, 4, 1], strides=[1, 7, 7, 1], rates=[1, 1, 1, 1], padding='SAME').eval())

Output:

[[[[ 1  2  3 11 12 13 21 22 23]
   [ 6  7  8 16 17 18 26 27 28]]

  [[51 52 53 61 62 63 71 72 73]
   [56 57 58 66 67 68 76 77 78]]]]


[[[[  1   3   5  21  23  25  41  43  45]
   [  6   8  10  26  28  30  46  48  50]]

  [[ 51  53  55  71  73  75  91  93  95]
   [ 56  58  60  76  78  80  96  98 100]]]]


[[[[ 1  2  3  4 11 12 13 14 21 22 23 24 31 32 33 34]]]]


[[[[  1   2   3   4  11  12  13  14  21  22  23  24  31  32  33  34]
   [  8   9  10   0  18  19  20   0  28  29  30   0  38  39  40   0]]

  [[ 71  72  73  74  81  82  83  84  91  92  93  94   0   0   0   0]
   [ 78  79  80   0  88  89  90   0  98  99 100   0   0   0   0   0]]]]
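The second output block (rates set to 2) can be cross-checked without TensorFlow. The following is a plain-Python sketch, assuming a single-channel 2-D image and VALID padding; extract_patches_valid is a hypothetical helper written for this illustration, not the library's code.

```python
def extract_patches_valid(image, ksize, stride, rate):
    """Pure-Python sketch of patch extraction on a 2-D list (VALID padding).

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    rows, cols = len(image), len(image[0])
    # Number of input pixels one patch spans along each axis.
    effective = ksize + (ksize - 1) * (rate - 1)
    out = []
    for r0 in range(0, rows - effective + 1, stride):
        out_row = []
        for c0 in range(0, cols - effective + 1, stride):
            # Flatten the patch row by row, sampling every `rate`-th pixel.
            patch = [image[r0 + i * rate][c0 + j * rate]
                     for i in range(ksize) for j in range(ksize)]
            out_row.append(patch)
        out.append(out_row)
    return out


n = 10
# Same 10 x 10 image as in the sample code: the numbers 1 through 100 in order.
image = [[x * n + y + 1 for y in range(n)] for x in range(n)]

for out_row in extract_patches_valid(image, ksize=3, stride=5, rate=2):
    for patch in out_row:
        print(patch)
# [1, 3, 5, 21, 23, 25, 41, 43, 45]
# [6, 8, 10, 26, 28, 30, 46, 48, 50]
# [51, 53, 55, 71, 73, 75, 91, 93, 95]
# [56, 58, 60, 76, 78, 80, 96, 98, 100]
```

Each patch still holds 3 x 3 = 9 values, but they are drawn from a 5 x 5 region of the image, taking every second pixel.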

So, for example, our first result looks like the following (pixels that belong to a patch are marked with *):

 *  *  *  4  5  *  *  *  9 10 
 *  *  * 14 15  *  *  * 19 20 
 *  *  * 24 25  *  *  * 29 30 
31 32 33 34 35 36 37 38 39 40 
41 42 43 44 45 46 47 48 49 50 
 *  *  * 54 55  *  *  * 59 60 
 *  *  * 64 65  *  *  * 69 70 
 *  *  * 74 75  *  *  * 79 80 
81 82 83 84 85 86 87 88 89 90 
91 92 93 94 95 96 97 98 99 100

As you can see, we have 2 rows and 2 columns of patches, which are what out_rows and out_cols are.
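The number of patch rows and columns can also be computed ahead of time. The sketch below assumes TensorFlow's usual output-size convention for VALID and SAME padding, applied to the effective patch size from the documentation quoted in the question; output_size is a hypothetical helper name.

```python
import math


def output_size(in_size, ksize, stride, rate, padding):
    """Number of patches along one spatial dimension.

    Hypothetical helper for illustration only; not part of TensorFlow.
    """
    # Effective patch size, as given in the API documentation for `rates`.
    effective = ksize + (ksize - 1) * (rate - 1)
    if padding == 'VALID':
        # Only patches that fit entirely inside the image are kept.
        return max(math.ceil((in_size - effective + 1) / stride), 0)
    if padding == 'SAME':
        # The image is zero-padded as needed, so only the stride matters.
        return math.ceil(in_size / stride)
    raise ValueError('padding must be "VALID" or "SAME"')


# The four calls from the sample code, on a 10 x 10 image.
print(output_size(10, ksize=3, stride=5, rate=1, padding='VALID'))  # 2
print(output_size(10, ksize=3, stride=5, rate=2, padding='VALID'))  # 2
print(output_size(10, ksize=4, stride=7, rate=1, padding='VALID'))  # 1
print(output_size(10, ksize=4, stride=7, rate=1, padding='SAME'))   # 2
```

These values match the shapes of the four printed outputs: 2 x 2, 2 x 2, 1 x 1, and 2 x 2 patches.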

Solution 1
0 2018-02-16 16:55:53
