
OutOfRangeError: RandomShuffleQueue: Working with TensorFlow's shuffle_batch

I use string_input_producer to feed the filename of my TFRecord file into a tf.TFRecordReader, and pass the result to tf.train.shuffle_batch to make batches of my custom data. Then I create a session, build the model, etc., and start a tf.train.Coordinator.

It worked well during the first pass over the data (the first epoch), but it stops working at the second pass.

Say, I have 10,000 rows of data and set the batch size to 100. After 100 training loops, I got this error:

OutOfRangeError: RandomShuffleQueue '_2_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 100, current size 0)

I set num_epochs of string_input_producer to None, so I thought string_input_producer would feed the same filename again once tf.train.shuffle_batch runs out of data.

Am I missing something?

Here is a snippet of my code:

# queue with a list of filenames 
filename_queue = tf.train.string_input_producer([filename], name="filename_queue", 
                                                      shuffle=True, num_epochs=None) 
reader = tf.TFRecordReader()

_, serialized_example = reader.read(filename_queue) 
features = tf.parse_single_example(serialized_example,
          features={
                'f1': ....
                'f2': ....,
          })

# batch
batch_size = 100
f1_batch, f2_batch = tf.train.shuffle_batch([features['f1'], features['f2']],
                                            batch_size=batch_size, num_threads=10,
                                            capacity=1000 + 3 * batch_size,
                                            min_after_dequeue=1000)

# 
# ... create a session, build model, optimizer, summary writer etc
# 

init = tf.global_variables_initializer()  # initialize_all_variables is deprecated
sess.run(init)

# start queue thread
coord = tf.train.Coordinator()
tf.train.start_queue_runners(sess=sess, coord=coord)

for i in range(NUM_EPOCHS): 
    # get batch data    
    x, y = sess.run([f1_batch, f2_batch])        

    # optimize (x_ph / y_ph stand for the model's input placeholders,
    # which are not shown in this snippet)
    sess.run(optimizer, feed_dict={x_ph: x, y_ph: y})

    # ... calc loss, write summary ...

When following the official instructions for using TFRecords, an OutOfRangeError often comes from too few elements left in the queue, or from incorrect data construction or preprocessing.
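A common way to surface and handle this error is to initialize the queue's local variables and wrap the dequeue loop in a try/except. The sketch below is a minimal, self-contained stand-in for the asker's TFRecord pipeline: it uses a tiny in-memory input_producer instead of a TFRecordReader, and is written against tf.compat.v1 so it also loads under TF 2.x. The key points carry over: when num_epochs is set, the epoch counter is a LOCAL variable that must be initialized, and the end of the data surfaces as tf.errors.OutOfRangeError.

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

# Stand-in for the TFRecord pipeline: a queue that yields each of the
# 3 elements num_epochs times, then closes.
queue = tf.train.input_producer(tf.constant([1.0, 2.0, 3.0]),
                                num_epochs=2, shuffle=False)
item = queue.dequeue()

count = 0
with tf.Session() as sess:
    # num_epochs is tracked in a LOCAL variable -- without
    # local_variables_initializer() the queue fails immediately.
    sess.run([tf.global_variables_initializer(),
              tf.local_variables_initializer()])
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run(item)
            count += 1
    except tf.errors.OutOfRangeError:
        pass  # raised once the producer has emitted num_epochs passes
    finally:
        coord.request_stop()
        coord.join(threads)

print(count)  # 3 elements x 2 epochs = 6
```

With num_epochs=None (as in the question), the producer never closes by itself, so an OutOfRangeError there usually means an upstream op (for example parse_single_example) failed and the queue runners shut the queue down; checking the thread exceptions via coord.join often reveals the real cause.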

By the way, I have two questions. First, instead of using feed_dict, I want to know how to define the model in a function that takes the data as Tensor inputs; in this case, I have to construct separate computation graphs to feed the training data as well as the testing data. Second, if using feed_dict, how do you know when an epoch ends? It seems that tf.train.shuffle_batch will output fixed-length batches until the end, so it is hard for me to use the epoch (rather than the batch number) to control the training process.
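For the first question, the usual TF 1.x answer is variable sharing: define the model once in a function and call it on both the training and the testing input tensors under a reused variable scope, so the two graphs share one set of weights and no feed_dict is needed. A minimal sketch (the shapes, the random inputs, and the tf.compat.v1 import are assumptions, not from the original post):

```python
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

def model(x):
    # Reuse the same weights on every call, whichever input tensor comes in.
    with tf.variable_scope("model", reuse=tf.AUTO_REUSE):
        w = tf.get_variable("w", shape=[4, 1])
        return tf.matmul(x, w)

train_x = tf.random_normal([100, 4])  # e.g. the tensor from shuffle_batch
test_x = tf.random_normal([10, 4])    # e.g. a separate test pipeline

train_out = model(train_x)
test_out = model(test_x)

# Both graphs share one variable set.
print(len(tf.trainable_variables()))  # 1
```

For the second question, with num_epochs=None the simplest approach is to count batches yourself: with 10,000 examples and a batch size of 100, one epoch is 10000 // 100 = 100 steps, so every 100 steps marks an epoch boundary.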

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address. For any question please contact: yoyou2525@163.com.

 
© 2020-2024 STACKOOM.COM