Batching for a non-image data set with Tensorflow

Question

I am a beginner in TensorFlow. I have a data set with 43 inputs and one output. I want to create mini-batches of the data to train a deep learning model.

Here are my inputs:

x = tf.placeholder(tf.float32, shape=[None, 43])
y_ = tf.placeholder(tf.float32, shape=[None])

which I feed from a MATLAB file, like this:

train_mat = train_mat["binary_train"].value
feed_dict={x:Train[0:100,0:43] , y_:Train[0:100,43]}

I would like to use random batches instead of always taking records 0:100. I saw

tf.train.batch

but I could not figure out how it works. Could you please show me how I can do that?

Thanks, Afshin

Answer 1

tf.train.batch and similar methods are based on queues, which are best suited to loading a large number of samples asynchronously and in parallel. The TensorFlow documentation describes the basics of using queues, and there is also a blog post describing how to read data from files.

If you are going to use queues, the placeholders and feed_dict are unnecessary.

With queues, a possible solution for your specific case might look like this:

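import tensorflow as tf
from tensorflow.python.training import queue_runner

# Assumes train_mat is a float32 array of shape [num_rows, 44]
# (43 input columns followed by the label column).
# capacity and min_after_dequeue could be set according to your case
q = tf.RandomShuffleQueue(1000, 500, tf.float32, shapes=[[44]])
enq = q.enqueue_many(train_mat)  # each row becomes one queue element
queue_runner.add_queue_runner(queue_runner.QueueRunner(q, [enq]))

deq = q.dequeue()  # a single shuffled row of shape [44]
features = deq[0:43]
label = deq[43]

x, y_ = tf.train.batch([features, label], 100)

# then you can use x and y_ directly in inference and train process.
# Call tf.train.start_queue_runners(sess=sess) before evaluating them.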

The code above rests on some assumptions (for example, that train_mat is a float32 array with 44 columns), because the question does not provide enough information. However, I hope it points you in a useful direction.
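
If you would rather keep the placeholders and feed_dict from the question, random mini-batches can also be drawn in plain Python before each training step. Below is a minimal sketch of that alternative, not a TensorFlow API: the helper name random_batches, its parameters, and the toy data are all hypothetical, and it assumes every record holds the 43 inputs followed by the label. It only indexes rows, so it should work on a list of lists as well as on a 2-D NumPy array.

```python
import random

def random_batches(rows, batch_size, n_features=43, seed=None):
    """Yield (features, labels) mini-batches in a freshly shuffled order.

    rows is any indexable sequence of records whose first n_features
    columns are inputs and whose next column is the label.
    """
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)  # new random order on each call
    for start in range(0, len(order), batch_size):
        picked = [rows[i] for i in order[start:start + batch_size]]
        features = [list(r[:n_features]) for r in picked]
        labels = [r[n_features] for r in picked]
        yield features, labels

# Toy stand-in for the real training matrix: 250 rows, 43 inputs + 1 label.
toy_rows = [[float(i)] * 43 + [float(i % 2)] for i in range(250)]
batches = list(random_batches(toy_rows, batch_size=100, seed=0))
```

In a training loop, each pair could then be fed as feed_dict={x: features, y_: labels}. Calling the helper again at the start of every epoch reshuffles the data; the last batch is smaller when the number of rows is not a multiple of the batch size.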
