I am trying to use tf.train.shuffle_batch to batch training data loaded from a .csv file. However, when I run the code it does not seem to work: it shows no error, but it never finishes.
Could you suggest what is wrong with my code?
Also, what are suitable values for capacity and min_after_dequeue?
import tensorflow as tf
import numpy as np
test_label = []
in_label = []
iris_TRAINING = "iris_training.csv"
iris_TEST = "iris_test.csv"
# Load datasets.
training_set = tf.contrib.learn.datasets.base.load_csv_with_header(filename=iris_TRAINING, target_dtype=np.int, features_dtype=np.float32)
test_set = tf.contrib.learn.datasets.base.load_csv_with_header(filename=iris_TEST, target_dtype=np.int, features_dtype=np.float32)
x_train, x_test, y_train, y_test = training_set.data, test_set.data, training_set.target, test_set.target
for n in y_train:
    targets = np.zeros(3)
    targets[int(n)] = 1  # one-hot: use the class label as the index of the 1
    in_label.append(targets)  # store all training labels (one-hot)
training_label = np.asarray(in_label)
for i in y_test:
    test_targets = np.zeros(3)
    test_targets[int(i)] = 1  # one-hot: use the class label as the index of the 1
    test_label.append(test_targets)
test_label = np.asarray(test_label)
x = tf.placeholder(tf.float32, [None,4]) #generate placeholder to store value of features for training
W = tf.Variable(tf.zeros([4, 3])) #weight
b = tf.Variable(tf.zeros([3])) #bias
y = tf.matmul(x, W) + b
y_ = tf.placeholder(tf.float32, [None, 3]) #generate placeholder to store value of labels
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(y, y_))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
sess = tf.InteractiveSession()
# Train
tf.initialize_all_variables().run()
for i in range(5):
    batch_xt, batch_yt = tf.train.shuffle_batch([x_train,training_label],batch_size=10,capacity=200,min_after_dequeue=10)
    sess.run(train_step, feed_dict={x: batch_xt.eval(), y_: batch_yt.eval()})
    print(i)
# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: x_test, y_: test_label}))
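shuffle_batch builds a queue-based input pipeline in the graph: a shuffling queue, a dequeue operation that produces batches from it, and a QueueRunner that fills the queue (see the tf.train.shuffle_batch documentation for more details).
So you don't need to call shuffle_batch on every iteration, only once before your loop. You also have to call tf.train.start_queue_runners() afterwards; otherwise nothing fills the queue and evaluating a batch blocks forever, which is why your code never finishes. The end of your code should look something like this: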
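sess = tf.InteractiveSession()
# Train
tf.initialize_all_variables().run()
# Build the input queue once, outside the loop. enqueue_many=True treats the
# first dimension as the example index, so batches have shape [10, 4] and [10, 3].
batch_xt, batch_yt = tf.train.shuffle_batch([x_train, training_label], batch_size=10, capacity=200, min_after_dequeue=10, enqueue_many=True)
# Start the threads that fill the queue; without them the dequeue blocks forever.
tf.train.start_queue_runners()
for i in range(5):
    # Fetch both tensors in one run call so features and labels come from the same batch.
    batch_x, batch_y = sess.run([batch_xt, batch_yt])
    sess.run(train_step, feed_dict={x: batch_x, y_: batch_y})
    print(i)
# Test trained model
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={x: x_test, y_: test_label}))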
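Why the version without start_queue_runners() hangs instead of raising an error can be sketched with Python's standard queue module. This is only an analogy, not TensorFlow's implementation: a dequeue waits until some producer thread has filled the queue. The names try_dequeue and producer are made up for this illustration, and the timeout exists only so the demo returns instead of blocking.

```python
import queue
import threading

def try_dequeue(q, timeout=0.2):
    """Return the next item, or None if nothing arrives within `timeout` seconds."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None

def producer(q, items):
    """Stand-in for a queue runner thread: pushes items into the queue."""
    for item in items:
        q.put(item)

q = queue.Queue(maxsize=200)

# No producer is running yet, so nothing ever arrives.
# Without the timeout this call would block forever, like the original code.
before = try_dequeue(q)

# Starting the producer thread plays the role of tf.train.start_queue_runners().
t = threading.Thread(target=producer, args=(q, range(5)), daemon=True)
t.start()
t.join()
after = try_dequeue(q)

print(before, after)
```

Once the producer has run, the same dequeue call returns immediately.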
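Suitable values for capacity and min_after_dequeue depend on your available memory and I/O throughput. capacity limits how many elements of your dataset the queue holds in memory, and min_after_dequeue is the minimum number of elements kept in the queue after a dequeue, which controls how well the examples are mixed. These values mainly affect computation time and memory use rather than the final result (see the TensorFlow documentation for more details).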
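As a starting point, the TensorFlow input-pipeline guide recommends a capacity of min_after_dequeue + (num_threads + a small safety margin) * batch_size. A minimal sketch of that rule of thumb follows; the helper name suggested_capacity and the default margin of 3 are assumptions chosen for illustration, not part of the TensorFlow API.

```python
def suggested_capacity(min_after_dequeue, batch_size, num_threads=1, safety_margin=3):
    """Rule of thumb: min_after_dequeue + (num_threads + safety_margin) * batch_size.

    safety_margin=3 is an assumed default for this sketch, not a TensorFlow constant.
    """
    if min_after_dequeue < 0:
        raise ValueError("min_after_dequeue must be non-negative")
    if batch_size <= 0 or num_threads <= 0:
        raise ValueError("batch_size and num_threads must be positive")
    return min_after_dequeue + (num_threads + safety_margin) * batch_size

# Values from the question: batch_size=10, min_after_dequeue=10, one enqueue thread.
capacity = suggested_capacity(min_after_dequeue=10, batch_size=10)
print(capacity)
```

A larger min_after_dequeue gives better mixing at the cost of a slower start and more memory, so for a small dataset like this one it can be set close to the dataset size.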