
Keras: Loading minibatches from HDF5 and CSV

I have a large dataset, too large to fit into RAM, which is available either as HDF5 or CSV. How can I feed it into Keras in minibatches? Also, will this shuffle it for me, or do I need to pre-shuffle the dataset?

(I'm also interested in the case where the input is a NumPy recarray, since I believe Keras expects the input to be an ndarray.)

And if I want to do some lightweight preprocessing in Keras before learning (e.g. apply a few Python functions to the data to change the representation), can that be added?

Have a look at the fit_generator method in Keras (https://keras.io/models/sequential/#sequential-model-methods). It fits the model on data generated batch by batch by a Python generator. Since the generator is under your control, you can write the shuffling logic there yourself.
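As a rough illustration of such a generator, here is a minimal sketch that reads contiguous slices from disk and shuffles at two levels: the order of the batches, and the rows within each batch. The function name hdf5_batch_generator and its parameters are made up for this example. It assumes the features and labels are stored as two datasets of equal length that support len() and slicing, which holds for both h5py datasets and NumPy arrays.

```python
import numpy as np


def hdf5_batch_generator(features, labels, batch_size, shuffle=True, seed=None):
    """Yield (x, y) minibatches forever, as Keras expects from a generator.

    `features` and `labels` may be h5py datasets or NumPy arrays: anything
    that supports len() and contiguous slicing. Only one batch is held in
    memory at a time. (Hypothetical helper, not part of Keras.)
    """
    rng = np.random.default_rng(seed)
    n = len(features)
    starts = np.arange(0, n, batch_size)  # first row index of every batch
    while True:  # one pass over `starts` corresponds to one epoch
        if shuffle:
            rng.shuffle(starts)  # visit the batches in a new order each epoch
        for start in starts:
            stop = min(start + batch_size, n)
            x = np.asarray(features[start:stop])  # contiguous read from disk
            y = np.asarray(labels[start:stop])
            if shuffle:
                order = rng.permutation(len(x))  # shuffle rows inside the batch
                x, y = x[order], y[order]
            yield x, y
```

With h5py you would open the file with h5py.File(path, "r") and pass two of its datasets as features and labels, then hand the generator to model.fit_generator(gen, steps_per_epoch=..., epochs=...), where steps_per_epoch is the number of batches per pass, i.e. the row count divided by batch_size, rounded up. Note that this only shuffles at block level: rows that sit next to each other on disk always end up in the same batch, so if the file is sorted (for example by class), pre-shuffling it once on disk is still advisable. In recent Keras releases, model.fit accepts generators directly and fit_generator is deprecated.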

You can also apply preprocessing within the generator itself.
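For the CSV case, a sketch along the following lines shows where a preprocessing step could go. The names csv_batch_generator, label_column and preprocess are invented for this example. It assumes the file has a header row, that every column is numeric, and that one column holds the label.

```python
import csv
import itertools

import numpy as np


def csv_batch_generator(path, batch_size, label_column, preprocess=None):
    """Yield (x, y) minibatches from a numeric CSV file with a header, forever.

    `preprocess` is an optional function applied to each feature batch
    before it is yielded. (Hypothetical helper, not part of Keras.)
    """
    while True:  # reopen the file at the start of every epoch
        with open(path, newline="") as f:
            reader = csv.reader(f)
            header = next(reader)
            label_idx = header.index(label_column)
            while True:
                # Pull at most batch_size rows; the rest of the file stays on disk.
                rows = list(itertools.islice(reader, batch_size))
                if not rows:
                    break  # end of file, start the next epoch
                data = np.array(rows).astype(np.float32)
                y = data[:, label_idx]
                x = np.delete(data, label_idx, axis=1)  # all non-label columns
                if preprocess is not None:
                    x = preprocess(x)  # lightweight per-batch transformation
                yield x, y
```

Any Python function that maps a batch of features to an ndarray can be passed as preprocess, for example lambda x: np.log1p(x). Because a CSV file is read sequentially, this sketch does no shuffling at all; the rows arrive in file order, so the file would need to be shuffled beforehand.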

Hope this helps.

