Localizing objects in TensorFlow with Inception V3

Question

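I read this blog post, which describes how to localize objects in an image using Google's image classification model, Inception V3.

"We can interpret the 8x8x2048 representation as a grid of features, breaking the image down into 8 horizontal and 8 vertical grid squares."

Can anyone explain how I can access the 8x8x2048 layer of Inception in Python, and then use a 1x1 convolution to map each of these vectors to a class label?

Thanks!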

Answer 1

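The inception model in the tensorflow repo calls the inception.slim.inception_v3 function; that is where you need to modify the network to add a layer for the 1x1 convolution.

The change is very small, and you can follow the way the other layers are constructed there. For simplicity, the layer would look something like: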

    net = ops.conv2d(net, 2048, [1, 1])
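As a rough illustration of what such a layer computes (a NumPy sketch, not the slim code itself): a 1x1 convolution applies the same weight matrix to the 2048-vector at every grid position. The names `conv1x1`, `features`, `weights` and `bias`, the random untrained weights, and the 1000 output channels are all assumptions made for illustration; the line above uses 2048 output channels.

```python
import numpy as np

def conv1x1(features, weights, bias):
    """Apply a 1x1 convolution to a (H, W, C_in) feature grid.

    weights has shape (C_in, C_out) and bias has shape (C_out,).
    A 1x1 convolution is the same linear map applied at every grid cell.
    """
    return np.tensordot(features, weights, axes=([2], [0])) + bias

rng = np.random.default_rng(0)
# Stand-in for the 8x8x2048 activations (random values, not real features).
features = rng.standard_normal((8, 8, 2048)).astype(np.float32)
# Hypothetical, untrained weights mapping 2048 channels to 1000 outputs.
weights = (rng.standard_normal((2048, 1000)) * 0.01).astype(np.float32)
bias = np.zeros(1000, dtype=np.float32)

logits = conv1x1(features, weights, bias)
print(logits.shape)
```

Each of the 64 grid cells ends up with its own output vector, which is why a 1x1 convolution is a natural fit for per-cell class scores.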

Answer 2

I found that you can get the 8x8x2048 tensor like this:

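    import os
    import tensorflow as tf

    # Assumes the Inception V3 graph is already loaded into the default graph,
    # and that `config` and `directory` are defined elsewhere.
    with tf.Session(config=config) as sess:
        # The 8x8x2048 feature tensor
        tensor = sess.graph.get_tensor_by_name('mixed_10/join:0')
        for image_to_test in os.listdir(directory):
            image = os.path.join(directory, image_to_test)
            with tf.gfile.FastGFile(image, 'rb') as f:
                image_data = f.read()
            # Feed the raw JPEG bytes and fetch the feature grid
            decoded = {'DecodeJpeg/contents:0': image_data}
            predictions = sess.run(tensor, decoded)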

predictions now holds the 8x8x2048 values.

But I have not yet figured out how to get a class from the 2048 values.

This is what I am trying:

    import tensorflow.contrib.slim as slim

    predictions = sess.run(tensor, decoded)
    ppp = slim.conv2d(predictions, 2048, [1, 1])
    x = tf.unstack(ppp)

But this just returns tensors:

    Tensor("Conv/Relu:0", shape=(1, 8, 8, 2048), dtype=float32, device=/device:CPU:0)

    [<tf.Tensor 'unstack:0' shape=(8, 8, 2048) dtype=float32>]
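One possible way to get from the 2048 values to a class per grid cell, sketched in NumPy under stated assumptions: slim.conv2d creates a new, randomly initialized layer and returns a symbolic tensor, so it cannot reuse what the network has learned. In the stock Inception V3 the final classifier is a fully connected layer applied to the average-pooled 2048-vector, so the same trained weights could instead be applied to each of the 64 cells. The names `classify_cells`, `fc_weights` and `fc_bias` are hypothetical, the 1000-class count is assumed, and the random arrays below are only stand-ins for the real features and for weights that would have to be read from the trained graph.

```python
import numpy as np

def classify_cells(features, fc_weights, fc_bias):
    """Return per-cell class ids and probabilities for an (H, W, C) grid."""
    logits = features @ fc_weights + fc_bias       # (H, W, num_classes)
    logits -= logits.max(axis=-1, keepdims=True)   # for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=-1, keepdims=True)     # softmax over classes
    return probs.argmax(axis=-1), probs.max(axis=-1)

rng = np.random.default_rng(0)
# Stand-in for the (1, 8, 8, 2048) array returned by sess.run(tensor, decoded).
predictions = rng.standard_normal((1, 8, 8, 2048)).astype(np.float32)
# Placeholder classifier parameters; real ones must come from the trained model.
fc_weights = (rng.standard_normal((2048, 1000)) * 0.01).astype(np.float32)
fc_bias = np.zeros(1000, dtype=np.float32)

# Drop the batch dimension and classify every grid cell independently.
class_map, confidence = classify_cells(predictions[0], fc_weights, fc_bias)
print(class_map.shape, confidence.shape)
```

`class_map[i, j]` is then the most likely class index for the cell in row i, column j, and `confidence[i, j]` is its softmax probability.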


2 answers

Answer 1
Score 5, accepted, 2016-11-17 06:36:34

Answer 2
Score 0, 2017-06-09 15:02:19
