
Passing Input Pipeline to TensorFlow Estimator

I'm a noob to TF so go easy on me.

I have to train a simple CNN from a bunch of images in a directory with labels. After looking around a lot, I cooked up this code that prepares a TF input pipeline and I was able to print the image array.

    import tensorflow as tf
    from tensorflow.python.framework import dtypes, ops

    image_list, label_list = load_dataset()

    imagesq = ops.convert_to_tensor(image_list, dtype=dtypes.string)
    labelsq = ops.convert_to_tensor(label_list, dtype=dtypes.int32)

    # Makes an input queue
    input_q = tf.train.slice_input_producer([imagesq, labelsq],
                                                shuffle=True)

    file_content = tf.read_file(input_q[0])
    train_image = tf.image.decode_png(file_content, channels=3)
    train_label = input_q[1]

    train_image.set_shape([120,120,3])

    # collect batches of images before processing
    train_image_batch, train_label_batch = tf.train.batch(
        [train_image, train_label],
        batch_size=5
        # ,num_threads=1
    )

    with tf.Session() as sess:
        # initialize the variables
        sess.run(tf.global_variables_initializer())
        # initialize the queue threads to start to shovel data
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(coord=coord)
        # print("from the train set:")
        for i in range(len(image_list)):
            print(sess.run(train_image_batch))
        # sess.run(train_image)
        # sess.run(train_label)
        # classifier.fit(input_fn=lambda: (train_image, train_label),
        #                steps=100,
        #                monitors=[logging_hook])

        # stop our queue threads; the with-block closes the session
        coord.request_stop()
        coord.join(threads)

But looking at the MNIST example in the TF docs, I see they use a cnn_model_fn along with the Estimator class.

I have defined my own cnn_model_fn and would like to combine the two. Please help me figure out how to move forward. This code doesn't work:

    classifier = learn.Estimator(model_fn=cnn_model_fn, model_dir='./test_model')
    classifier.fit(input_fn=lambda: (train_image, train_label),
                   steps=100,
                   monitors=[logging_hook])

It seems the pipeline is populated only when the session is run; otherwise it's empty, and I get a ValueError: 'Input graph and Layer graph are not the same'.

Please help me.

I'm new to TensorFlow myself, so take this with a grain of salt.

AFAICT, when you call any of the tf APIs that create "tensors" or "operations", they are created in a context called a Graph.

Further, I believe that when the Estimator runs, it creates a new empty Graph for each run. It populates that Graph by running model_fn and input_fn, which are supposed to call tf APIs that add the "tensors" and "operations" in the context of this fresh Graph.

The return values from model_fn and input_fn just provide references so that the parts can be connected correctly; the Graph already contains them.

However, in this example the input operations were created before the Estimator created its Graph, so they were added to the implicit default Graph (one is created automatically, I believe). So when the Estimator creates a new Graph and populates it with the model via model_fn, the input and the model end up on two different graphs.
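To see the mismatch concretely, here is a minimal sketch (graph construction only, nothing from the question): a tensor belongs to whichever Graph was the default when it was created, and tensors from different graphs cannot be wired together.

```python
import tensorflow as tf

# Each tensor is recorded on the Graph that was the default at creation time.
g1 = tf.Graph()
with g1.as_default():
    a = tf.constant(1.0, name='a')   # added to g1

g2 = tf.Graph()
with g2.as_default():
    b = tf.constant(2.0, name='b')   # added to g2

print(a.graph is b.graph)  # False: ops from different graphs can't be mixed
```

This is exactly the situation the Estimator creates: your pipeline ops sit on the default graph, while the Estimator builds the model on its own fresh one.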

To fix this, you need to change the input_fn. Don't just wrap the (image, labels) pair in a lambda; instead, wrap the entire construction of the input pipeline in a function, so that when the Estimator runs input_fn, all the input operations and tensors are created, as a side effect of the API calls, in the context of the correct Graph.
