
Using a Tensorflow Pretrained model

I am trying to use a model with pretrained weights from TensorFlow.

I am a bit lost on how I should load it to generate predictions. I want to run object detection on an image using a faster_rcnn model.

For the model faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28 I have the following files:

|   checkpoint
|   frozen_inference_graph.pb
|   model.ckpt.data-00000-of-00001
|   model.ckpt.index
|   model.ckpt.meta
|   pipeline.config
|
\---saved_model
    |   saved_model.pb
    |
    \---variables

Here is my attempt to load the model and generate some predictions:

import tensorflow as tf
import cv2

model_folder = "faster_rcnn_inception_resnet_v2_atrous_lowproposals_oid_2018_01_28"

model_graph_file = model_folder+"/frozen_inference_graph.pb"
model_weights_file = model_folder+"/model.ckpt.data-00000-of-00001"

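# Parse the frozen graph definition from disk
graph_def = tf.GraphDef()
with tf.gfile.GFile(model_graph_file, 'rb') as f:
    graph_def.ParseFromString(f.read())

# Import the definition into a Graph object so tensors can be looked up by name
graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')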

#print([n.name + '=>' +  n.op for n in graph_def.node if n.op in ('Placeholder')])
#print([n.name + '=>' +  n.op for n in graph_def.node if n.op in ('Softmax')])

input = graph.get_tensor_by_name('image_tensor:0')
classes = graph.get_tensor_by_name('detection_classes:0')
scores = graph.get_tensor_by_name('detection_scores:0')
boxes = graph.get_tensor_by_name('detection_boxes:0')
softmax = graph.get_tensor_by_name('Softmax:0')

my_image = cv2.imread('resources/my_image.jpg')

with tf.Session(graph=graph) as sess:
    classes_out,scores_out,boxes_out,softmax  = sess.run([classes,scores,boxes,softmax],feed_dict={input:[my_image]})
    print(classes_out)
    print(classes_out.shape)
    print(scores_out)
    print(scores_out.shape)
    print(boxes_out)
    print(boxes_out.shape)
    print(softmax)
    print(softmax.shape)

Which prints the following:

[[1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]]
(1, 20)
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
(1, 20)
[[[0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]
  [0. 0. 0. 0.]]]
(1, 20, 4)
[[9.9819970e-01 1.8002436e-03]
 [9.9932957e-01 6.7051285e-04]
 [9.9853170e-01 1.4682930e-03]
 ...
 [9.9990737e-01 9.2630769e-05]
 [9.9939859e-01 6.0135941e-04]
 [9.6443009e-01 3.5569914e-02]]
(115200, 2)

Obviously I am doing something wrong here, but I don't know exactly what. How can I know which layers to use as output layers? How can I retrieve the classes, scores and boxes of my objects? Am I loading my model correctly?

EDIT:

Based on Lescurel's answer:

For some reason I had to make some changes to the code to run it: tf.saved_model.tag_constants.SERVING -> [tf.saved_model.tag_constants.SERVING]

and

input_tensor = model_signature["inputs"].name -> input_tensor = model_signature.inputs['inputs'].name (using tensorflow 1.12).

Now I have some results and I am really happy about it, but for the same image and the same model used by Lescurel I get very different outputs:

[array([[0.5936514 , 0.5774365 , 0.519677  , 0.46745843, 0.36366013,
        0.3496253 , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ,
        0.        , 0.        , 0.        , 0.        , 0.        ]],
      dtype=float32), array([[33.,  1., 68., 11., 13.,  7.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
         1.,  1.,  1.,  1.,  1.,  1.,  1.]], dtype=float32), array([6.], dtype=float32), array([[[0.6699049 , 0.68924683, 0.9372702 , 0.78685343],
        [0.21414267, 0.264757  , 0.9868771 , 0.51174635],
        [0.34444967, 0.65146637, 0.70101655, 0.80124986],
        [0.8743748 , 0.7071637 , 0.9687472 , 0.7784833 ],
        [0.7832241 , 0.51456743, 0.9550611 , 0.59617543],
        [0.32543942, 0.6407225 , 0.9539846 , 0.81454873],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ],
        [0.        , 0.        , 0.        , 0.        ]]], dtype=float32)]

Any idea why?

You loaded the graph structure of the network, but not the trained weights. Because of that, the network cannot make any meaningful predictions. To load the weights of the graph in TF 1.x, you can refer to the guide.

The following code snippet loads the graph and its weights, and performs predictions (this snippet uses faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco from the model zoo):

import cv2
import tensorflow as tf #tf.1.x

model_dir = "faster_rcnn_inception_resnet_v2_atrous_lowproposals_coco_2018_01_28/saved_model"

img = cv2.imread("/path/to/image.jpg")

with tf.Session() as sess:
    # Load the model and its weights.
    # Models from the zoo are frozen, so we use the SERVING tag
    # (passed as a list, which TF 1.12 requires)
    model = tf.saved_model.loader.load(sess,
                               [tf.saved_model.tag_constants.SERVING],
                               model_dir)
    # Get the model signature
    model_signature = model.signature_def["serving_default"]
    # The input tensor is stored under the "inputs" key of the signature
    input_tensor = model_signature.inputs["inputs"].name
    # Get the names of the output tensors
    output_tensor = [v.name for k, v in model_signature.outputs.items() if v.name]
    # Run the prediction on a batch containing the single image
    outs = sess.run(output_tensor, feed_dict={input_tensor: [img]})

A sample output on an image:

>>> outs
[array([[0.9998708 , 0.99963164, 0.9926651 , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ,
         0.        , 0.        , 0.        , 0.        , 0.        ]],
       dtype=float32),
 array([[ 1.,  1., 18.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,  1.,
          1.,  1.,  1.,  1.,  1.,  1.,  1.]], dtype=float32),
 array([3.], dtype=float32),
 array([[[0.35335696, 0.6397857 , 0.96252066, 0.8067749 ],
         [0.25126144, 0.2766906 , 0.97366196, 0.5463176 ],
         [0.7696026 , 0.52089834, 0.9537483 , 0.59052485],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ],
         [0.        , 0.        , 0.        , 0.        ]]], dtype=float32)]
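To turn these arrays into per-object results, here is a minimal sketch. It assumes the four arrays in the sample output are, in order, the scores, the class ids, the number of valid detections, and the boxes (the order follows the signature's output map, so it is worth checking on your side), and that boxes are normalized [ymin, xmin, ymax, xmax], which is the convention of the Object Detection API. The function name decode_detections and the min_score parameter are hypothetical, chosen only for illustration.

```python
def decode_detections(scores, classes, num_detections, boxes,
                      image_height, image_width, min_score=0.5):
    """Convert padded detection outputs (batch size 1) into a list of dicts.

    Assumes boxes are normalized [ymin, xmin, ymax, xmax]. Works with NumPy
    arrays as returned by sess.run, or with plain nested lists.
    """
    # Entries past this index are zero padding, not real detections
    count = int(num_detections[0])
    detections = []
    for i in range(count):
        score = float(scores[0][i])
        if score < min_score:
            continue
        ymin, xmin, ymax, xmax = (float(v) for v in boxes[0][i])
        detections.append({
            "class_id": int(classes[0][i]),
            "score": score,
            # Scale normalized coordinates to pixels: (left, top, right, bottom)
            "box_px": (
                int(round(xmin * image_width)),
                int(round(ymin * image_height)),
                int(round(xmax * image_width)),
                int(round(ymax * image_height)),
            ),
        })
    return detections
```

With the snippet above, this could be called as scores, classes, num, boxes = outs followed by decode_detections(scores, classes, num, boxes, img.shape[0], img.shape[1]). The class ids still have to be mapped to names with the label map that matches the dataset the model was trained on.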
