Batching for a non-image data set with Tensorflow

I am a beginner in TensorFlow. I have a data set with 43 inputs and one output, and I want to split the data into mini-batches to train a deep learning model.

Here are my inputs:

x = tf.placeholder(tf.float32, shape=[None, 43])
y_ = tf.placeholder(tf.float32, shape=[None])

I feed them from a MATLAB file, like this:

train_mat = train_mat["binary_train"].value
feed_dict={x: train_mat[0:100, 0:43], y_: train_mat[0:100, 43]}

I would like to feed a random batch instead of always taking records 0:100. I saw

tf.train.batch

but I could not figure out how it works. Could you please show me how to do this?

Thanks, Afshin

tf.train.batch and similar methods are built on queues, which are best suited to loading a large number of samples asynchronously and in parallel. The documentation here describes the basics of using queues in TensorFlow. There is also a blog post describing how to read data from files.

If you are going to use queues, the placeholders and feed_dict are unnecessary.

For your specific case, a potential solution might look like this:

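import tensorflow as tf
from tensorflow.python.training import queue_runner

# capacity and min_after_dequeue can be set according to your case;
# shapes=[[44]] assumes each row holds 43 inputs plus 1 label
q = tf.RandomShuffleQueue(1000, 500, tf.float32, shapes=[[44]])
enq = q.enqueue_many(train_mat)
queue_runner.add_queue_runner(queue_runner.QueueRunner(q, [enq]))

# dequeue() returns one randomly chosen row (a vector of 44 values)
deq = q.dequeue()
features = deq[0:43]
label = deq[43]

# group single rows into batches of 100
x, y_ = tf.train.batch([features, label], 100)

# then you can use x and y_ directly in the inference and training steps;
# call tf.train.start_queue_runners(sess) before running them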

The code above rests on some assumptions, because the question does not provide enough information. Still, I hope it points you in a useful direction.
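If the whole data set fits in memory, random mini-batches can also be drawn with plain NumPy while keeping the placeholders and feed_dict from the question. The following is only a minimal sketch: the helper name random_batches and the stand-in array train are hypothetical, and it assumes the label is the last column of the matrix.

```python
import numpy as np

def random_batches(data, batch_size, rng=None):
    """Yield shuffled (features, labels) mini-batches covering `data` once.

    Assumes `data` is a 2-D array whose last column is the label
    (43 inputs + 1 output in the question).
    """
    rng = np.random.default_rng() if rng is None else rng
    order = rng.permutation(len(data))  # shuffled row indices
    for start in range(0, len(data), batch_size):
        rows = data[order[start:start + batch_size]]
        yield rows[:, :-1], rows[:, -1]

# Hypothetical stand-in for the matrix loaded from the MATLAB file.
train = np.arange(250 * 44, dtype=np.float32).reshape(250, 44)

# One pass (epoch) over the data in random order, 100 rows at a time.
batches = list(random_batches(train, 100, np.random.default_rng(0)))
```

Each (features, labels) pair can then be passed as feed_dict={x: features, y_: labels} in the training loop. The last batch of an epoch may be smaller than batch_size, which the shape=[None, 43] placeholder already allows.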
