How to calculate the FLOPs of a TensorFlow model loaded from a pb file

I have a model saved in a pb file, and I would like to calculate its FLOPs. My example code is as follows:

import tensorflow as tf
from tensorflow.python.platform import gfile

pb_path = 'themodel.pb'

run_meta = tf.RunMetadata()
with tf.Session() as sess:
    print("load graph")
    with gfile.FastGFile(pb_path, 'rb') as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
        sess.graph.as_default()
        tf.import_graph_def(graph_def, name='')
        flops = tf.profiler.profile(tf.get_default_graph(), run_meta=run_meta,
            options=tf.profiler.ProfileOptionBuilder.float_operation())
        print("test flops:{:,}".format(flops.total_float_ops))

The printed output is strange: my model has tens of layers, but it reports only 18 FLOPs. I'm quite sure the model is loaded correctly, because if I print the name of every layer as follows:

print([n.name for n in tf.get_default_graph().as_graph_def().node])

the output shows exactly the right network.

What's wrong with my code?

Thank you!

I think I found the reason and the solution. The following code prints the FLOPs of a given pb file.

import os
import tensorflow as tf
from tensorflow.core.framework import graph_pb2
from tensorflow.python.framework import importer

os.environ['CUDA_VISIBLE_DEVICES'] = '0'

pb_path = 'mymodel.pb'

run_meta = tf.RunMetadata()
with tf.Graph().as_default():
    output_graph_def = graph_pb2.GraphDef()
    with open(pb_path, "rb") as f:
        output_graph_def.ParseFromString(f.read())
        _ = importer.import_graph_def(output_graph_def, name="")
        print('model loaded!')
    all_keys = sorted([n.name for n in tf.get_default_graph().as_graph_def().node])
    # for k in all_keys:
    #   print(k)

    with tf.Session() as sess:
        flops = tf.profiler.profile(tf.get_default_graph(), run_meta=run_meta,
            options=tf.profiler.ProfileOptionBuilder.float_operation())
        print("test flops:{:,}".format(flops.total_float_ops))
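If you cannot regenerate the pb file with a fixed input shape, one workaround is to re-import the GraphDef with a fixed-shape placeholder mapped over the original input via `input_map`, so every downstream op gets a concrete shape the profiler can use. The sketch below builds a small toy graph in place of a real pb file, and assumes the input tensor is named `input:0`; substitute your own GraphDef and input name.

```python
import tensorflow as tf

# Toy graph whose input has undefined spatial dims, mirroring the issue.
g1 = tf.Graph()
with g1.as_default():
    x = tf.compat.v1.placeholder(tf.float32, [1, None, None, 3], name='input')
    w = tf.constant(0.1, shape=[3, 3, 3, 8])
    y = tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding='SAME', name='output')
graph_def = g1.as_graph_def()  # stands in for a GraphDef parsed from a pb file

# Re-import with a fixed-shape placeholder mapped onto the old input,
# so the profiler can count shape-dependent FLOPs.
g2 = tf.Graph()
with g2.as_default():
    fixed = tf.compat.v1.placeholder(tf.float32, [1, 500, 500, 3],
                                     name='fixed_input')
    tf.import_graph_def(graph_def, input_map={'input:0': fixed}, name='')
    opts = tf.compat.v1.profiler.ProfileOptionBuilder.float_operation()
    flops = tf.compat.v1.profiler.profile(g2, options=opts)
    print('total FLOPs:', flops.total_float_ops)
```

The original Placeholder node is still imported but becomes dead; the convolution now hangs off the fixed-shape input, so `float_operation()` has concrete dimensions to work with.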

The reason the FLOPs reported in the question were merely 18 is that, when generating the pb file, I set the input image shape to [None, None, 3]. If I change it to, say, [500, 500, 3], then the printed FLOPs are correct.
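This makes sense: a convolution's FLOP count scales with the spatial size of the input, so with height and width unknown the profiler can only count the few shape-independent ops. A rough back-of-the-envelope helper (illustrative only, not TF's exact accounting) makes the dependence explicit:

```python
def conv2d_flops(h, w, k, c_in, c_out):
    """Approximate FLOPs for one SAME-padded, stride-1 conv layer,
    counting each multiply-accumulate as two FLOPs."""
    return 2 * h * w * k * k * c_in * c_out

# With a fixed 500x500 input the count is well defined:
print(conv2d_flops(500, 500, 3, 3, 64))  # 864000000

# With h = w = None there is nothing to plug in, which is why the
# profiler effectively skipped every convolution.
```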

Not sure how it would compute any performance measure without knowing the inputs and outputs: maybe it needs CallableOptions? I'd use trace_next_step with a Session rather than computing those manually.
