
How to use Keras TimeseriesGenerator to take a validation sample for every n training samples?

I'm working on a time series forecasting problem using the Keras library for neural networks. I'm trying to split the training set into actual training and validation sets. I don't want to take all the validation data from the end of my set, but rather take 1 validation sample for every 5 training samples.

I've managed to create two generators

training_sequence = TimeseriesGenerator(train_x, train_y, length=w, sampling_rate=1, batch_size=batch_s)
validation_sequence = TimeseriesGenerator(train_x, train_y, length=w, sampling_rate=1, stride=5, batch_size=batch_s)

I then use them for training like this:

history = model.fit_generator(training_sequence, validation_data=validation_sequence, epochs=200, callbacks=[early_stopping_monitor], verbose=1)

Now, I'm obtaining the correct sequence for validation, but I can't figure out how to exclude those samples from the training sequence (so the model isn't validated on data it has already trained on).

I've tried processing the training generator in a wrapper, like so:

def get_generator(data, targets, length, batch_size):
    data_gen = TimeseriesGenerator(data, targets, length=length, 
                                   sampling_rate=1, batch_size=batch_size)
    for i in range(len(data_gen)):
        if i % 5 != 0:
            x, y = data_gen[i]
            yield x, y

But when I run the code, I get this error:

ValueError: `steps_per_epoch=None` is only valid for a generator based on the `keras.utils.Sequence` class. Please specify `steps_per_epoch` or use the `keras.utils.Sequence` class.

If I add

steps_per_epoch=len(train_x)/batch_s

I get a "StopIteration" error.

"StopIteration" error is occurring because Model is demanding data from Generator, But Generator has already Exhausted all the data.

Imagine we have 320 elements in train_x and our batch size is 32. Then steps_per_epoch = 320 / 32 = 10.

So you have to yield 10 times in every epoch. But because of the if condition, nothing is yielded for i = 0 and i = 5, so the generator yields only 8 times even though steps_per_epoch told the model to expect 10.
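A quick check (illustrative only, using the numbers above) confirms the mismatch between the batches promised to the model and the batches actually yielded:

steps_per_epoch = 320 // 32                                   # 10 batches promised via steps_per_epoch
actual_yields = sum(1 for i in range(steps_per_epoch) if i % 5 != 0)
print(steps_per_epoch, actual_yields)                         # 10 8 -> two batches short every epoch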

To compensate, reduce steps_per_epoch by the number of skipped batches:

steps_per_epoch = len(train_x) // batch_s
num_skipped_steps = steps_per_epoch // 5
steps_per_epoch = steps_per_epoch - num_skipped_steps

If that still doesn't work, try wrapping the for loop in a while loop so the generator never exhausts (fit_generator expects a generator that can loop over the data indefinitely):

from keras.preprocessing.sequence import TimeseriesGenerator

def get_generator(data, targets, length, batch_size):
    data_gen = TimeseriesGenerator(data, targets, length=length,
                                   sampling_rate=1, batch_size=batch_size)
    # Loop forever so Keras can keep pulling batches, epoch after epoch
    while True:
        for i in range(len(data_gen)):
            if i % 5 != 0:  # skip every 5th batch, which is reserved for validation
                x, y = data_gen[i]
                yield x, y
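Putting the two pieces together, a call along these lines should work. This is only a sketch that reuses the names from the question (train_x, train_y, w, batch_s, model, early_stopping_monitor); counting the non-skipped batches of the underlying TimeseriesGenerator gives an exact steps_per_epoch instead of the approximation above.

training_generator = get_generator(train_x, train_y, length=w, batch_size=batch_s)
validation_sequence = TimeseriesGenerator(train_x, train_y, length=w, sampling_rate=1,
                                          stride=5, batch_size=batch_s)

# Count the batches the wrapper actually yields in one pass over the data
underlying = TimeseriesGenerator(train_x, train_y, length=w, sampling_rate=1, batch_size=batch_s)
steps_per_epoch = sum(1 for i in range(len(underlying)) if i % 5 != 0)

history = model.fit_generator(training_generator,
                              steps_per_epoch=steps_per_epoch,
                              validation_data=validation_sequence,
                              epochs=200,
                              callbacks=[early_stopping_monitor],
                              verbose=1)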
