I use string_input_producer with the filename of my TFRecord file as the input to a tf.TFRecordReader, and feed the reader's output into tf.train.shuffle_batch to make batches of my custom data. Then I create a session, build the model, etc., and start a tf.train.Coordinator.
It worked well during the first pass over the data, but it stops working on the second pass. Say I have 10,000 rows of data and set the batch size to 100. After 100 training steps, I get this error:
OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)
I set num_epochs of string_input_producer to None, so I thought string_input_producer would feed the same filename again once tf.train.shuffle_batch ran out of data. Am I missing something?
Here is a snippet of my code:
# queue with a list of filenames
filename_queue = tf.train.string_input_producer([filename], name="filename_queue",
                                                shuffle=True, num_epochs=None)
reader = tf.TFRecordReader()
_, serialized_example = reader.read(filename_queue)
features = tf.parse_single_example(serialized_example,
                                   features={
                                       'f1': ...,
                                       'f2': ...,
                                   })
f1 = features['f1']
f2 = features['f2']
# batch
batch_size = 100
f1_batch, f2_batch = tf.train.shuffle_batch([f1, f2], batch_size=batch_size, num_threads=10,
                                            capacity=1000 + 3*batch_size,
                                            min_after_dequeue=1000)
#
# ... create a session, build model, optimizer, summary writer etc
#
init = tf.initialize_all_variables()
sess.run(init)
# start the queue-runner threads
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)
for i in range(NUM_EPOCHS):
    # get batch data
    x_val, y_val = sess.run([f1_batch, f2_batch])
    # optimize (x and y here are the model's input placeholders)
    sess.run(optimizer, feed_dict={x: x_val, y: y_val})
    # ... calc loss, write summary ...
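As written, the snippet never stops the queue-runner threads. The standard TF 1.x pattern wraps the training loop in a try/except/finally and joins the threads at the end; a minimal sketch, assuming threads is the list returned by tf.train.start_queue_runners and NUM_STEPS is a hypothetical total step count:

```python
# Sketch only (TF 1.x queue API): clean shutdown of the input threads.
# Assumes the same sess and coord as above, plus
#   threads = tf.train.start_queue_runners(sess=sess, coord=coord)
try:
    for i in range(NUM_STEPS):  # NUM_STEPS: hypothetical step count
        x_val, y_val = sess.run([f1_batch, f2_batch])
        # ... run the optimizer on the batch ...
except tf.errors.OutOfRangeError:
    # Raised once the queue is closed and empty, e.g. a finite
    # num_epochs was exhausted, or a preprocessing thread failed
    # and closed the queue.
    print('Input pipeline exhausted')
finally:
    coord.request_stop()   # signal all queue-runner threads to stop
    coord.join(threads)    # wait for them to shut down
```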
When following the official instructions for using TFRecord files, an OutOfRangeError usually means the queue has run out of elements, which often comes from a problem in the data construction or pre-processing.
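One frequently missed cause of this exact error is worth checking: when num_epochs is finite, string_input_producer keeps its epoch counter in a local variable, and tf.initialize_all_variables() only covers global variables; the uninitialized counter then closes the queue immediately. A minimal sketch of initializing both sets (TF 1.x API):

```python
# Sketch (TF 1.x): initialize BOTH global and local variables, since the
# epoch counter created by string_input_producer(num_epochs=N) is a
# local variable that tf.initialize_all_variables() does not touch.
init_op = tf.group(tf.global_variables_initializer(),
                   tf.local_variables_initializer())
sess.run(init_op)
```

Run this before tf.train.start_queue_runners, in place of the plain tf.initialize_all_variables() call.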
By the way, I have two questions. 1. Instead of using feed_dict, I want to know how to define the model in a function that takes the data as Tensor inputs. In that case, I have to construct separate computation graphs to feed the training data and the test data. 2. If I use feed_dict, how do I know when an epoch ends? It seems that tf.train.shuffle_batch outputs fixed-length batches indefinitely, so it's hard for me to control training by epoch (rather than by batch count).
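On question 1, the usual TF 1.x pattern is to build the model in a function that takes tensors as input, and to share the weights between the training and test graphs with tf.variable_scope and reuse. A sketch, where the function name inference, the scope name model, and the shape constants are my own placeholders:

```python
def inference(x):
    # Create/look up variables with tf.get_variable so they can be shared.
    w = tf.get_variable('w', [NUM_FEATURES, NUM_CLASSES])
    b = tf.get_variable('b', [NUM_CLASSES])
    return tf.matmul(x, w) + b

with tf.variable_scope('model'):
    train_logits = inference(train_batch)      # creates w and b
with tf.variable_scope('model', reuse=True):
    test_logits = inference(test_batch)        # reuses the same w and b
```

Both graphs then read their data directly from tensors (e.g. the outputs of two tf.train.shuffle_batch pipelines), so no feed_dict is needed.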
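On question 2, with num_epochs=None the filename producer cycles forever, so the queue itself never signals an epoch boundary; since shuffle_batch emits fixed-size batches, the common workaround is simply to count steps. A small self-contained sketch using the sizes from the question (10,000 rows, batch size 100):

```python
# Derive epoch boundaries from step counts instead of a queue signal.
# Sizes taken from the question: 10,000 examples, batches of 100.
num_examples = 10000
batch_size = 100

# shuffle_batch always emits full batches, so one epoch is this many
# steps (any remainder simply spills over into the next pass).
steps_per_epoch = num_examples // batch_size

num_epochs = 5
epochs_completed = 0
for step in range(num_epochs * steps_per_epoch):
    # x, y = sess.run([f1_batch, f2_batch]); train on the batch here
    if (step + 1) % steps_per_epoch == 0:
        epochs_completed += 1
        # ... end-of-epoch work: evaluation, checkpoints, summaries ...
```

Alternatively, set num_epochs to a finite value and catch the resulting tf.errors.OutOfRangeError to detect the end of the last epoch (remembering to initialize local variables in that case).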