Training huge amounts of data with tensorflow

I have about 60 thousand samples of size 200x870. They are all numpy arrays, and I want to build a four-dimensional tensor out of them (with one singleton dimension) and train a CNN on them in TensorFlow. Up to this point, I was using data that I could just load into memory and create batches from, as below:

 with tf.Graph().as_default():
     # The whole training set is loaded into memory up front.
     data_train = tf.to_float(getInput.data_train)
     phase, lr = tf.placeholder(tf.bool), tf.placeholder(tf.float32)
     global_step = tf.Variable(0, trainable=False)
     # Slice the in-memory tensors into single examples and batch them.
     image_train, label_train = tf.train.slice_input_producer([data_train, labels_train], num_epochs=args.num_epochs)
     images_train, batch_labels_train = tf.train.batch([image_train, label_train], batch_size=args.bsize)

With the full dataset this no longer works, because the data does not fit into memory. Can someone suggest a way around it?

I wanted to split the dataset into subsets and, within one epoch, train on one subset after the other, using a Queue for the paths of these files:

import scipy.io as sc
import numpy as np
import threading
import time

import tensorflow as tf
from tensorflow.python.client import timeline


def testQueues():
    paths = ['data1', 'data2', 'data3', 'data4', 'data5']
    queue_capacity = 6
    bsize = 10
    num_epochs = 2

    # FIFO queue that holds the names of the .npy subset files.
    filename_queue = tf.FIFOQueue(
        #min_after_dequeue=0,
        capacity=queue_capacity,
        dtypes=tf.string,
        shapes=[[]]
    )
    filenames_placeholder = tf.placeholder(dtype='string', shape=(None))
    filenames_enqueue_op = filename_queue.enqueue_many(filenames_placeholder)
    data_train, phase = tf.placeholder(tf.float32), tf.placeholder(tf.bool)

    sess = tf.Session()
    sess.run(filenames_enqueue_op, feed_dict={filenames_placeholder: paths})

    for i in range(len(paths)):
        # Pop one file name from the queue, load that subset and train on it.
        train_set_batch_name = sess.run(filename_queue.dequeue())
        train_set_batch_name = train_set_batch_name.decode('utf-8')
        train_set_batch = np.load(train_set_batch_name + '.npy')
        train_set_batch = tf.cast(train_set_batch, tf.float32)
        init_op = tf.group(tf.initialize_all_variables(), tf.initialize_local_variables())
        sess.run(init_op)
        run_one_epoch(train_set_batch, sess)

        size = sess.run(filename_queue.size())
        print(size)
        print(train_set_batch)


def run_one_epoch(train_set, sess):
    image_train = tf.train.slice_input_producer([train_set], num_epochs=1)
    images_train = tf.train.batch(image_train, batch_size=10)
    x = tf.nn.relu(images_train)
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run(x)
    except tf.errors.OutOfRangeError:
        pass
    finally:
        # When done, ask the threads to stop.
        coord.request_stop()
        coord.join(threads)


testQueues()

However, I get the following error:

FailedPreconditionError: Attempting to use uninitialized value input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs
     [[Node: input_producer/input_producer/fraction_of_32_full/limit_epochs/CountUpTo = CountUpTo[T=DT_INT64, _class=["loc:@input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs"], limit=1, _device="/job:localhost/replica:0/task:0/cpu:0"](input_producer/input_producer/fraction_of_32_full/limit_epochs/epochs)]]

Also, it seems that I can't feed the feed_dict with a tf.Tensor, only with a numpy array, but casting it to a tf.Tensor later is also troublesome.

Have a look at the Dataset API. "The tf.data API enables you to build complex input pipelines from simple, reusable pieces."

In this approach, you model your graph so that it handles the data for you and pulls in a limited amount of data at a time for you to train your model on.
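As a minimal sketch of such a pipeline in the same TF 1.x style as the question, assuming `features` and `labels` are numpy arrays holding one manageable chunk of the data (the names and shapes here are just illustrative stand-ins):

import numpy as np
import tensorflow as tf

# Illustrative stand-ins for one chunk of the real data
# (200x870 samples with a singleton channel dimension).
features = np.random.rand(100, 200, 870, 1).astype(np.float32)
labels = np.random.randint(0, 2, size=(100,)).astype(np.int64)

# Build the input pipeline from simple, reusable pieces:
# slice into single examples, shuffle, batch, prefetch.
dataset = tf.data.Dataset.from_tensor_slices((features, labels))
dataset = dataset.shuffle(buffer_size=100).batch(10).prefetch(1)

iterator = dataset.make_one_shot_iterator()
batch_images, batch_labels = iterator.get_next()

with tf.Session() as sess:
    while True:
        try:
            imgs, lbls = sess.run([batch_images, batch_labels])
            # the CNN training step would consume imgs/lbls here
        except tf.errors.OutOfRangeError:
            break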

If the memory issue still persists, then you might want to look into using a generator to create your tf.data.Dataset (see the sketch below). Your next step could be to speed the process up further by preparing TFRecords to create your Dataset.
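A sketch of the generator idea, assuming the subsets are stored as .npy files as in the question (the file names and per-sample shape are assumptions). Only one chunk is ever held in memory, and samples are streamed into the Dataset one at a time:

import numpy as np
import tensorflow as tf

# Assumed chunk files, each holding an array of shape (n_samples, 200, 870).
paths = ['data1.npy', 'data2.npy', 'data3.npy', 'data4.npy', 'data5.npy']

def sample_generator():
    # Load one chunk at a time and yield individual samples,
    # so at most one subset sits in memory.
    for path in paths:
        chunk = np.load(path)
        for sample in chunk:
            yield sample.astype(np.float32)[..., np.newaxis]

dataset = tf.data.Dataset.from_generator(
    sample_generator,
    output_types=tf.float32,
    output_shapes=tf.TensorShape([200, 870, 1]))
dataset = dataset.batch(10).prefetch(1)

iterator = dataset.make_one_shot_iterator()
next_batch = iterator.get_next()

with tf.Session() as sess:
    while True:
        try:
            batch = sess.run(next_batch)  # feed this into the CNN
        except tf.errors.OutOfRangeError:
            break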

Follow all the links to learn more and feel free to comment if you don't understand something.

For data that doesn't fit into memory, the standard solution is to use queues. You can set up some ops that read from files directly (CSV files, image files) and feed them into TensorFlow -- https://www.tensorflow.org/versions/r0.11/how_tos/reading_data/index.html
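As a rough sketch of that queue-based pattern, along the lines of the linked guide (the file names and the three-feature-column CSV layout are assumptions made purely for illustration):

import tensorflow as tf

# Hypothetical CSV files; each row holds the feature columns plus a label.
filenames = ['train_0.csv', 'train_1.csv']

filename_queue = tf.train.string_input_producer(filenames, num_epochs=1)
reader = tf.TextLineReader()
_, value = reader.read(filename_queue)

# record_defaults fixes the per-column types: three float features, one int label.
record_defaults = [[0.0], [0.0], [0.0], [0]]
col1, col2, col3, label = tf.decode_csv(value, record_defaults=record_defaults)
features = tf.stack([col1, col2, col3])

batch_features, batch_labels = tf.train.batch(
    [features, label], batch_size=10, allow_smaller_final_batch=True)

with tf.Session() as sess:
    # num_epochs keeps its epoch counter in a local variable, so local
    # variables must be initialized before the queue runners start.
    sess.run(tf.group(tf.global_variables_initializer(),
                      tf.local_variables_initializer()))
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        while not coord.should_stop():
            sess.run([batch_features, batch_labels])  # training step goes here
    except tf.errors.OutOfRangeError:
        pass
    finally:
        coord.request_stop()
        coord.join(threads)

The same initialization ordering matters for the code in the question: the epoch counter created by an input producer is a local variable, so it has to be covered by the initializer that is actually run.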
