How to avoid the out of range error using shuffle_batch() function?

Question

Updated Question

I am trying to match labels with images in TensorFlow using the shuffle_batch() function, but I get an out-of-range error from RandomShuffleQueue when I start the training loop.

1. My updated question

How to avoid the queue out of range error when using shuffle_batch function?

2. My updated code

The following code works well for the first 90 steps or so, with increasing accuracy, until it raises an error.

# Global Parameters

# Image Size
training_size = 1387
img_height = 64
img_width = 64

# File stream
batch_size = 128

# Training parameter
learning_rate = 0.001
training_iters = 100
keep_prob = 0.5 #dropout keep prob
display_step = 10

AdamOptimizer = 1
GradientDescentOptimizer = 0

# Filepath
csv_filepath = r'C:/Users/Jeffy/OneDrive/Course\NMDA\retinaProject\label.csv'
image_filepath = 'Image_P/'


import tensorflow as tf
# =============================================================================
# Read input data

# load csv content
csv_path = tf.train.string_input_producer(['label_3D.csv'])
textReader = tf.TextLineReader()
_, csv_content = textReader.read(csv_path)
im_name, col_2, col_3, col_4 = tf.decode_csv(csv_content, record_defaults=[[""], [1], [1], [1]])
label = tf.pack([col_2, col_3, col_4])
label_float32 = tf.cast(label, tf.float32)

# load images
im_content = tf.read_file(image_filepath + im_name+'.jpeg')
image = tf.image.decode_jpeg(im_content, channels=3)
image_float32 = tf.cast(image, tf.float32)/255

# Generate Batch
batch_shape = ((img_height, img_width, 3),(3))
images_batch, labels_batch = tf.train.shuffle_batch([image_float32, label_float32], 
                                                    batch_size = batch_size, 
                                                    capacity = batch_size * 50, 
                                                    min_after_dequeue = batch_size * 10, 
                                                    shapes = batch_shape)

# =============================================================================
# Construct Network
# define functions
def weight_varible(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# paras
W_conv1 = weight_varible([5, 5, 3, 32])
b_conv1 = bias_variable([32])

# conv layer-1
h_conv1 = tf.nn.relu(conv2d(images_batch, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# conv layer-2
W_conv2 = weight_varible([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# full connection
W_fc1 = weight_varible([16 * 16 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 16 * 16 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# dropout

h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# output layer: softmax
W_fc2 = weight_varible([1024, 3])
b_fc2 = bias_variable([3])


y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)

# model training
cross_entropy = -tf.reduce_sum(labels_batch * tf.log(y_conv))

if GradientDescentOptimizer:
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
else:
    if AdamOptimizer:
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)

correct_prediction = tf.equal(tf.arg_max(y_conv, 1), tf.arg_max(labels_batch, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Start file queue
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    sess.run([images_batch, labels_batch])

    coord.request_stop()
    coord.join(threads)

    for i in range(training_iters):
        # display the result on console
        if i % display_step == 0:
            train_accuacy = accuracy.eval()
            print("step %d, training accuracy %g"%(i, train_accuacy))
        # run the model
        train_step.run()
    print("test accuracy %g"%(accuracy.eval()))
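The 16 * 16 * 64 used for W_fc1 above follows from the 64 x 64 input passing through two 2x2 max-pool layers with stride 2 and 'SAME' padding, each of which halves height and width (rounding up), while the stride-1 'SAME' convolutions leave them unchanged. A small arithmetic check in plain Python, where the helper names same_padding_output and flattened_size are hypothetical and not part of TensorFlow:

```python
def same_padding_output(size, stride):
    """Spatial output size under 'SAME' padding: ceil(size / stride)."""
    return -(-size // stride)  # integer ceiling division


def flattened_size(height, width, pool_layers, channels):
    """Return (height, width, flat_length) after the given number of
    2x2, stride-2 max-pool layers with 'SAME' padding."""
    for _ in range(pool_layers):
        height = same_padding_output(height, 2)
        width = same_padding_output(width, 2)
    return height, width, height * width * channels


# 64 x 64 input, two pooling layers, 64 output channels in conv layer-2
result = flattened_size(64, 64, pool_layers=2, channels=64)
print(result)
```

If the images were a different size, the same calculation would give the value to use in place of 16 * 16 * 64.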

3. Updated running result

The updated code raises an error at around step 90:

step 0, training accuracy 0.5625
step 10, training accuracy 0.6875
step 20, training accuracy 0.703125
step 30, training accuracy 0.625
step 40, training accuracy 0.65625
step 50, training accuracy 0.6875
step 60, training accuracy 0.6875
step 70, training accuracy 0.734375
step 80, training accuracy 0.632812
step 90, training accuracy 0.695312

then

OutOfRangeError: RandomShuffleQueue '_24_shuffle_batch_3/random_shuffle_queue' is closed and has insufficient elements (requested 128, current size 1)
     [[Node: shuffle_batch_3 = QueueDequeueMany[_class=["loc:@shuffle_batch_3/random_shuffle_queue"], component_types=[DT_FLOAT, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch_3/random_shuffle_queue, shuffle_batch_3/n)]]

Caused by op 'shuffle_batch_3', defined at:
  File "C:\Users\Jeffy\Anaconda3\lib\site-packages\spyder\utils\ipython\start_kernel.py", line 223, in <module>
    main()
......
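The message "is closed and has insufficient elements" says that the queue had been closed while fewer than batch_size elements remained in it. The toy model below is plain Python, not TensorFlow's implementation, and only sketches that behaviour: a closed queue keeps serving what is already buffered and fails once it runs dry. The names ToyShuffleQueue, QueueClosedError and steps_until_failure are hypothetical. Whether this is what happened in the run above is an assumption; one thing that may be worth checking in the updated code is that coord.request_stop() and coord.join(threads) are called before the training loop rather than after it.

```python
import random


class QueueClosedError(Exception):
    """Raised when a closed queue cannot satisfy a dequeue request."""


class ToyShuffleQueue:
    """Toy stand-in for a shuffle queue (not TensorFlow's implementation)."""

    def __init__(self, seed=0):
        self._items = []
        self._closed = False
        self._rng = random.Random(seed)

    def enqueue(self, item):
        if self._closed:
            raise QueueClosedError("queue is closed")
        self._items.append(item)

    def close(self):
        self._closed = True

    def dequeue_many(self, n):
        if len(self._items) < n:
            if self._closed:
                raise QueueClosedError(
                    "closed with insufficient elements "
                    "(requested %d, current size %d)" % (n, len(self._items)))
            raise RuntimeError("a real queue would block here until refilled")
        self._rng.shuffle(self._items)
        batch, self._items = self._items[:n], self._items[n:]
        return batch


def steps_until_failure(prefilled, batch_size, max_steps):
    """Fill the queue, close it, then count how many batches can be read."""
    queue = ToyShuffleQueue()
    for i in range(prefilled):
        queue.enqueue(i)
    queue.close()  # analogous to stopping the producers before the loop
    for step in range(max_steps):
        try:
            queue.dequeue_many(batch_size)
        except QueueClosedError:
            return step
    return max_steps
```

In this model the loop runs fine for as long as the buffered elements last and then fails, which matches the general shape of the error above: several successful steps followed by a failure on a closed queue.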

Old Question:

Why is the label of the image stuck on the first sample when I match labels with images using tf.batch()? My old question was solved with great help from Standy and my friend Dong. Thanks a lot to them!

1. My questions

Where should the shuffle_batch() function go, at position1 or position2? (You can find both in the with tf.Session() part of the code below.)
How can I fix the code so that the batch successfully combines the image and the label, without getting stuck on the first example?

2. csv file

The image name and the corresponding label are stored in label.csv in the following format (without a header):

11219_right,0,1,0
15502_left,0,0,0
14481_right,0,1,0
11032_right,0,0,1
19322_right,0,0,0
......
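As a sanity check outside TensorFlow, the format above can be parsed in plain Python into an image name and a three-element float label, mirroring what decode_csv followed by a cast produces in the updated code. This is only a sketch; the function name parse_label_rows and the SAMPLE_ROWS constant are hypothetical, and the sample holds just the five rows shown above.

```python
import csv
import io

# The five rows shown above; the real file continues beyond them.
SAMPLE_ROWS = """11219_right,0,1,0
15502_left,0,0,0
14481_right,0,1,0
11032_right,0,0,1
19322_right,0,0,0
"""


def parse_label_rows(text):
    """Parse 'name,c2,c3,c4' rows into (name, [float, float, float]) pairs."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:  # skip blank lines
            continue
        name = record[0]
        label = [float(value) for value in record[1:4]]
        rows.append((name, label))
    return rows


rows = parse_label_rows(SAMPLE_ROWS)
for name, label in rows:
    print(name, label)
```

Checking a few rows this way makes it easier to tell whether a label that never changes comes from the file itself or from the input pipeline.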

3. Original code

The purpose of the code is to use a CNN as an image classifier.

The CNN structure is based on the TensorFlow example file.

You can focus on the "Read input data" and "Launch the graph" parts.

# Global Parameters

# Image Size
training_size = 1387
img_height = 64
img_width = 64

# File stream
batch_size = 128

# Training parameter
learning_rate = 0.001
training_iters = 100
keep_prob = 0.5 #dropout keep prob
display_step = 10

AdamOptimizer = 1
GradientDescentOptimizer = 0

# Filepath
csv_filepath = r'C:/Users/Jeffy/OneDrive/Course\NMDA\retinaProject\label.csv'
image_filepath = 'Image_P/'


# import library
import tensorflow as tf
import numpy as np
#=============================================================================
# Read input data

# load csv content
csv_path = tf.train.string_input_producer(['label.csv'])
textReader = tf.TextLineReader()
_, csv_content = textReader.read(csv_path)
im_name, label = tf.decode_csv(csv_content, record_defaults=[[""], [1]])

# load images
im_content = tf.read_file(image_filepath + im_name+'.jpeg')
image = tf.image.decode_jpeg(im_content, channels=3)

def label_3D (label_num):
    label_3D = np.zeros(3)
    if label_num == 0:
        label_3D[0] = 1
    else:
        if label_num == 3:
            label_3D[1] = 1
        else: # label_num == 4
            label_3D[2] = 1
    return label_3D
# =============================================================================
# Construct Network(you can skip this part)

# define functions
def weight_varible(shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

def conv2d(x, W):
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')

def max_pool_2x2(x):
    return tf.nn.max_pool(x, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')


# paras
W_conv1 = weight_varible([5, 5, 3, 32])
b_conv1 = bias_variable([32])

# conv layer-1
x = tf.Variable(tf.zeros([batch_size, img_width, img_height, 3]))
h_conv1 = tf.nn.relu(conv2d(x, W_conv1) + b_conv1)
h_pool1 = max_pool_2x2(h_conv1)

# conv layer-2
W_conv2 = weight_varible([5, 5, 32, 64])
b_conv2 = bias_variable([64])

h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
h_pool2 = max_pool_2x2(h_conv2)

# full connection
W_fc1 = weight_varible([16 * 16 * 64, 1024])
b_fc1 = bias_variable([1024])

h_pool2_flat = tf.reshape(h_pool2, [-1, 16 * 16 * 64])
h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, W_fc1) + b_fc1)

# dropout

h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)

# output layer: softmax
W_fc2 = weight_varible([1024, 3])
b_fc2 = bias_variable([3])


y_conv = tf.nn.softmax(tf.matmul(h_fc1_drop, W_fc2) + b_fc2)
y_ = tf.Variable(tf.zeros([batch_size, 3]))

# model training
cross_entropy = -tf.reduce_sum(y_ * tf.log(y_conv))
if GradientDescentOptimizer:
    train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(cross_entropy)
else:
    if AdamOptimizer:
        train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)

correct_prediction = tf.equal(tf.arg_max(y_conv, 1), tf.arg_max(y_, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# ==========================================================================
# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Start file queue
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(coord=coord)

    images, labels = sess.run([image, label])
    #position1 for tf.train.shuffle_batch() function

    coord.request_stop()
    coord.join(threads)

    for i in range(training_iters):
        #position2 for tf.train.shuffle_batch() function
        batch = tf.train.shuffle_batch([images,label_3D(labels)], batch_size=batch_size,
                               capacity = batch_size * 50, 
                               min_after_dequeue = batch_size * 10,
                               num_threads = 1)

        if i % display_step == 0:
            x = batch[0]
            y_ = batch[1]
            train_accuacy = accuracy.eval()
            print("step %d, training accuracy %g"%(i, train_accuacy))
        x= batch[0]
        y_ = batch[1]
        train_step.run()


4. Running result

- In IPython console

step 0, training accuracy 0.226562
step 10, training accuracy 1
step 20, training accuracy 1
step 30, training accuracy 1
step 40, training accuracy 1

- In variable explorer

The variables labels and images stay unchanged at the label and image of the first example, which is stored in the first line of the label.csv file.

Therefore, I infer that the file-reading queue is stuck on the first line, causing the CNN to converge quickly to an accuracy of 1.

Answer 1

shuffle_batch accepts tensors and returns tensors, so it is a TensorFlow op and should be placed in the graph (see the shuffle_batch docs).

I would place it right after you decode a single image:

image = tf.image.decode_jpeg(im_content, channels=3)
images_batch, labels_batch = tf.train.shuffle_batch([image, label], batch_size, batch_size * 50, batch_size * 10)
# images_batch is now a Tensor of shape (batch_size, height, width, channels)
...
h_conv1 = tf.nn.relu(conv2d(images_batch, W_conv1) + b_conv1)

You don't need the variables x and y_ now, and you do not need to assign your inputs manually when you use tf.train.shuffle_batch.

It may seem counter-intuitive that tf.train.shuffle_batch accepts tensors for a single example and produces a whole batch, but remember that the tensors you feed to this op come from a queue, so tf.train.shuffle_batch can "wait" for multiple elements. Under the hood it actually uses another queue for shuffling and for storing intermediate elements, as the shuffle_batch implementation shows.
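The idea of taking single examples in and handing whole batches out can be sketched in plain Python. This is a simplified model under stated assumptions, not the TensorFlow implementation: the generator name shuffle_batches is hypothetical, it runs in one thread, and it silently drops a final partial batch where the real queue would raise an error.

```python
import random


def shuffle_batches(examples, batch_size, min_after_dequeue, seed=0):
    """Yield shuffled batches from a stream of single examples.

    A buffer is filled one example at a time. An element is sampled from
    it only while it holds more than min_after_dequeue elements, which is
    what keeps the output well mixed.
    """
    rng = random.Random(seed)
    buffer = []
    batch = []
    for example in examples:
        buffer.append(example)
        while len(buffer) > min_after_dequeue:
            # Move a random element to the end, then pop it.
            index = rng.randrange(len(buffer))
            buffer[index], buffer[-1] = buffer[-1], buffer[index]
            batch.append(buffer.pop())
            if len(batch) == batch_size:
                yield batch
                batch = []
    # Input exhausted: the minimum no longer applies, so drain the buffer.
    rng.shuffle(buffer)
    for example in buffer:
        batch.append(example)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # Any leftover partial batch is dropped in this sketch.


batches = list(shuffle_batches(range(10), batch_size=3, min_after_dequeue=4))
print(len(batches))
```

With ten examples and a batch size of 3, the sketch yields three full batches and drops the one remaining example.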

1 answer

solution1
0 ACCEPTED 2017-01-07 10:14:21
