
TensorFlow: shuffle_batch did not show any error but did not finish

I am trying to use tf.train.shuffle_batch to batch training data loaded from a .csv file. However, when I run the code, it does not seem to work: it shows no error, but it never finishes.

Could you suggest what is wrong with my code?

Also, what are suitable values for capacity and min_after_dequeue?

import tensorflow as tf
import numpy as np


test_label = []
in_label = []

iris_TRAINING = "iris_training.csv"
iris_TEST = "iris_test.csv"

# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(filename=iris_TRAINING, target_dtype=np.int, features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(filename=iris_TEST, target_dtype=np.int, features_dtype=np.float32)

x_train, x_test, y_train, y_test = training_set.data, test_set.data, training_set.target, test_set.target



for n in y_train:
    targets = np.zeros(3)
    targets[int(n)] = 1   # one-hot encode: the label value is the index of the hot entry
    in_label.append(targets)  # store all one-hot training labels
training_label = np.asarray(in_label)

for i in y_test:
    test_targets = np.zeros(3)
    test_targets[int(i)] = 1  # one-hot encode the test labels the same way
    test_label.append(test_targets)
test_label = np.asarray(test_label)


x = tf.placeholder(tf.float32, [None, 4])   # placeholder for the training features

W = tf.Variable(tf.zeros([4, 3]))  # weights
b = tf.Variable(tf.zeros([3]))     # bias

y = tf.matmul(x, W) + b

y_ = tf.placeholder(tf.float32, [None, 3])  # placeholder for the one-hot labels


cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)


sess = tf.InteractiveSession()
# Train
tf.initialize_all_variables().run()

for i in range(5):
    batch_xt, batch_yt = tf.train.shuffle_batch([x_train,training_label],batch_size=10,capacity=200,min_after_dequeue=10)
    sess.run(train_step, feed_dict={x: batch_xt.eval(), y_: batch_yt.eval()})  
    print(i)

# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


print(sess.run(accuracy, feed_dict={x: x_test, y_: test_label}))

shuffle_batch builds (sketched below):

  1. a queue Q into which batches of your dataset are enqueued
  2. an operation that dequeues Q to produce a batch
  3. a QueueRunner whose threads enqueue Q

(see here for more details)
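Roughly, that one call wires up the following (a minimal sketch of the TF 1.x queue machinery, using stand-in random tensors instead of a real input pipeline):

import tensorflow as tf

# stand-ins for tensors that produce one example at a time
example_x = tf.random_uniform([4])  # 4 features
example_y = tf.one_hot(tf.random_uniform([], maxval=3, dtype=tf.int32), 3)  # one-hot label

# 1. a RandomShuffleQueue Q that holds and shuffles examples
q = tf.RandomShuffleQueue(capacity=200, min_after_dequeue=10,
                          dtypes=[tf.float32, tf.float32],
                          shapes=[[4], [3]])
enqueue_op = q.enqueue([example_x, example_y])

# 2. an operation that dequeues a batch of 10 examples from Q
batch_xt, batch_yt = q.dequeue_many(10)

# 3. a QueueRunner whose background threads keep running enqueue_op
qr = tf.train.QueueRunner(q, [enqueue_op])
tf.train.add_queue_runner(qr)  # registered so tf.train.start_queue_runners() launches it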

So you don't need to call shuffle_batch at each iteration, only once, before your loop. And you have to call tf.train.start_queue_runners() afterwards: without it, no thread ever fills the queue, so the dequeue blocks forever, which is why your program hangs without any error. So the end of your code should look something like:

sess = tf.InteractiveSession()
# Train
tf.initialize_all_variables().run()
batch_xt, batch_yt = tf.train.shuffle_batch([x_train, training_label], batch_size=10, capacity=200, min_after_dequeue=10, enqueue_many=True)  # enqueue_many=True so each row counts as one example
tf.train.start_queue_runners()

for i in range(5):
    xt, yt = sess.run([batch_xt, batch_yt])  # one sess.run so features and labels stay paired
    sess.run(train_step, feed_dict={x: xt, y_: yt})
    print(i)

# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))


print(sess.run(accuracy, feed_dict={x: x_test, y_: test_label}))
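One optional refinement (standard TF 1.x practice, not part of the answer above): create a tf.train.Coordinator so the queue threads can be stopped cleanly once training and evaluation are done:

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess, coord=coord)

# ... training loop and evaluation as above ...

coord.request_stop()  # ask the enqueue threads to stop
coord.join(threads)   # wait for them to finish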

Suitable values for capacity and min_after_dequeue depend on your available memory and I/O throughput: capacity bounds how much memory the queue occupies, and min_after_dequeue controls how well the examples are mixed. They can affect computation time but not the final result (see here for more details).
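As a starting point, the tf.train.shuffle_batch documentation suggests deriving capacity from min_after_dequeue (the concrete numbers below are just illustrative):

# rule of thumb from the tf.train.shuffle_batch docs:
# capacity = min_after_dequeue + (num_threads + a small safety margin) * batch_size
batch_size = 10
num_threads = 1
min_after_dequeue = 100   # larger -> better shuffling, but more memory and a slower start
capacity = min_after_dequeue + (num_threads + 3) * batch_size

batch_xt, batch_yt = tf.train.shuffle_batch(
    [x_train, training_label], batch_size=batch_size, num_threads=num_threads,
    capacity=capacity, min_after_dequeue=min_after_dequeue, enqueue_many=True)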
