
Using a sparse data generator with Keras/TensorFlow

I have implemented a network in C++ running on the CPU, and I am trying to train it on the GPU with Python. The problem I face is that the input is very large and sparse: about 50,000 input neurons (split over two inputs of 24,576 each), of which usually only about 30 are active.
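
For concreteness, here is a toy sketch (the indices are made up purely for illustration) of what one such sparse sample can look like as a tf.SparseTensor: a single row of width 24,576 with only 30 positions set to 1, everything else implicitly 0.

import numpy as np
import tensorflow as tf

# Hypothetical sample: 30 active positions out of 24,576, all with value 1.
active = np.sort(np.random.choice(24576, size=30, replace=False))
sample = tf.SparseTensor(
    indices=[[0, int(i)] for i in active],  # (row, column) pairs
    values=tf.ones(30),
    dense_shape=(1, 24576),
)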

My model looks like this:

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
input_1 (InputLayer)            (None, 24576)        0                                            
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 24576)        0                                            
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 256)          6291712     input_1[0][0]                    
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 256)          6291712     input_2[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_1 (LeakyReLU)       (None, 256)          0           dense_1[0][0]                    
__________________________________________________________________________________________________
leaky_re_lu_2 (LeakyReLU)       (None, 256)          0           dense_2[0][0]                    
__________________________________________________________________________________________________
concatenate_1 (Concatenate)     (None, 512)          0           leaky_re_lu_1[0][0]              
                                                                 leaky_re_lu_2[0][0]              
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 32)           16416       concatenate_1[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_3 (LeakyReLU)       (None, 32)           0           dense_3[0][0]                    
__________________________________________________________________________________________________
dense_4 (Dense)                 (None, 32)           1056        leaky_re_lu_3[0][0]              
__________________________________________________________________________________________________
leaky_re_lu_4 (LeakyReLU)       (None, 32)           0           dense_4[0][0]                    
__________________________________________________________________________________________________
dense_5 (Dense)                 (None, 1)            33          leaky_re_lu_4[0][0]              
==================================================================================================
Total params: 12,600,929
Trainable params: 12,600,929
Non-trainable params: 0
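
For reference, here is a minimal sketch of how this architecture could be rebuilt with the Keras functional API. The sparse=True flag on the Input layers is an assumption on my side: in recent TensorFlow versions it lets the first Dense layers consume tf.SparseTensor inputs directly.

import tensorflow as tf
from tensorflow.keras import layers, Model

inp1 = layers.Input(shape=(24576,), sparse=True)
inp2 = layers.Input(shape=(24576,), sparse=True)

h1 = layers.LeakyReLU()(layers.Dense(256)(inp1))
h2 = layers.LeakyReLU()(layers.Dense(256)(inp2))

merged = layers.Concatenate()([h1, h2])
h = layers.LeakyReLU()(layers.Dense(32)(merged))
h = layers.LeakyReLU()(layers.Dense(32)(h))
out = layers.Dense(1)(h)

model = Model(inputs=[inp1, inp2], outputs=out)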

I also have about 300 million input/output pairs that I am trying to feed into my network. Needless to say, this is far too much data to fit onto my GPU at once.

For speed, I pre-generate sparse matrices, each representing about 100,000 inputs, and save them to disk (about 50 GB in total). I can load them quickly like this:

# Loads both inputs and the outputs for the given chunk (100,000 inputs/outputs) from disk.
trainX1, trainX2, trainY = readNumpyChunkAndCreateInput(chunk)

I then train my network like this:

totalLoss = 0.0
for chunk in chunks:
    trainX1, trainX2, trainY = readNumpyChunkAndCreateInput(chunk)

    _res = model.fit([trainX1, trainX2], trainY, epochs=1, steps_per_epoch=1, verbose=0)
    loss = list(_res.history.values())[0]
    totalLoss += loss[0]

Clearly, this is not optimal by any means. I know that Keras/TensorFlow has something called data generators, but sadly I do not know how to use them in my specific case, because all the tutorials deal with dense inputs. I would be very happy if someone could help me out here!

Greetings, Finn

Edit 1

The way I load the data:

import os
import sys
import numpy as np
import tensorflow as tf

def readNumpyChunkAndCreateInput(name):
    filePath = os.path.abspath(os.path.dirname(sys.argv[0]))
    path = filePath + "\\data\\" + name + "\\"

    indices1 = np.load(path + 'indices1.npy')
    indices2 = np.load(path + 'indices2.npy')
    outputs = np.load(path + 'outputs.npy')

    meta = open(path + 'meta.txt', "r")
    metaInf = meta.readlines()[0].split(" ")
    meta.close()

    entry1Count = int(metaInf[0])
    entry2Count = int(metaInf[1])
    lineCount = int(metaInf[2])

    # All active entries have the value 1.
    values1 = tf.ones(entry1Count)
    values2 = tf.ones(entry2Count)

    shape = (lineCount, 6 * 64 * 64)

    trainX1 = tf.SparseTensor(
        indices=indices1,
        values=values1,
        dense_shape=shape
    )

    trainX2 = tf.SparseTensor(
        indices=indices2,
        values=values2,
        dense_shape=shape
    )

    return trainX1, trainX2, outputs

I have written a small generator function that you can adapt to your use case.

import os
import numpy as np

def gen():
    chunk_dirs = os.listdir('temp_data')  # directory with one sub-folder per saved chunk
    for chunk in chunk_dirs:
        chunk_path = os.path.join('temp_data', chunk)
        # Load the three arrays saved for this chunk (file names taken
        # from the loading code above).
        x = np.load(os.path.join(chunk_path, 'indices1.npy'))
        y = np.load(os.path.join(chunk_path, 'indices2.npy'))
        z = np.load(os.path.join(chunk_path, 'outputs.npy'))
        # Your logic, e.g. build the two SparseTensors as in
        # readNumpyChunkAndCreateInput above.
        #
        yield (x, y, z)  # Three tensors/NumPy arrays. In your case trainX1, trainX2, outputs.

Code for using the generator with tf.data.Dataset:

dataset = tf.data.Dataset.from_generator(gen, (tf.float32, tf.float32, tf.float32))
dataset = dataset.prefetch(2)
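
Since your chunks are tf.SparseTensor objects rather than dense arrays, from_generator needs their structure declared explicitly. A sketch, assuming TensorFlow 2.4+ (which added output_signature) and a generator that yields the two SparseTensors plus the dense outputs:

chunk_shape = (None, 6 * 64 * 64)  # rows per chunk vary, columns are fixed

dataset = tf.data.Dataset.from_generator(
    gen,
    output_signature=(
        tf.SparseTensorSpec(shape=chunk_shape, dtype=tf.float32),
        tf.SparseTensorSpec(shape=chunk_shape, dtype=tf.float32),
        tf.TensorSpec(shape=(None,), dtype=tf.float32),  # adjust to your outputs' shape
    ),
)
dataset = dataset.prefetch(2)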

The prefetch lets the next chunk be prepared in advance, hiding the loading delay. You can pass this dataset to your fit call or use a custom training loop like this:

epochs = 100

# Assumed setup for the loop below (placeholders; pick what fits your task):
loss_fn = tf.keras.losses.MeanSquaredError()
optimizer = tf.keras.optimizers.Adam()

for epoch in range(epochs):
    print("\nStart of epoch %d" % (epoch,))

    # Iterate over the chunks yielded by the dataset.
    for step, (x1_batch_train, x2_batch_train, y_batch_train) in enumerate(dataset):

        # Open a GradientTape to record the operations run
        # during the forward pass, which enables auto-differentiation.
        with tf.GradientTape() as tape:

            # Run the forward pass of the model. The operations it
            # applies to its inputs are recorded on the GradientTape.
            logits = model([x1_batch_train, x2_batch_train], training=True)  # Logits for this minibatch

            # Compute the loss value for this minibatch.
            loss_value = loss_fn(y_batch_train, logits)

        # Use the gradient tape to automatically retrieve
        # the gradients of the trainable variables with respect to the loss.
        grads = tape.gradient(loss_value, model.trainable_weights)

        # Run one step of gradient descent by updating
        # the value of the variables to minimize the loss.
        optimizer.apply_gradients(zip(grads, model.trainable_weights))

        # Log every 200 steps.
        if step % 200 == 0:
            print(
                "Training loss (for one batch) at step %d: %.4f"
                % (step, float(loss_value))
            )
            print("Seen so far: %d samples" % ((step + 1) * int(y_batch_train.shape[0])))
