
Process a LOT of data

I'm working with parametric energy simulations and have ended up with more than 500 GB of data stored in .csv files. I need to process all of this data to compare the results and gain insight into the influence of the different parameters.

Each .csv file name encodes the parameters used for that simulation, so I cannot simply merge the files.

Normally I load the .csv files into Python with pandas and wrap them in a class, but with this much data there is not enough memory to do that.

Can you point me to a way to process this data? I need to be able to make plots and compare the .csv files.

Thank you for your time.

Convert the .csv files to HDF5, a format designed for very large and complex datasets. It works with pandas as well as with other libraries.
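As a rough illustration of that conversion, here is a minimal sketch that streams one .csv file into an HDF5 store in chunks, so the whole file never has to fit in memory. The function name `csv_to_hdf5`, the `sim_` key prefix, and the chunk size are assumptions made for this example, not something from the question. It also assumes the columns are mostly numeric and that the PyTables package (`tables`) is installed, which pandas needs for HDF5 support.

```python
from pathlib import Path
import re

import pandas as pd  # pandas' HDF5 support relies on the PyTables package ("tables")


def csv_to_hdf5(csv_path, store_path, chunksize=100_000):
    """Append one CSV file to an HDF5 store chunk by chunk; return the key used.

    Hypothetical helper for illustration only.
    """
    # Build the HDF5 key from the file name so the parameter information
    # encoded in the name is preserved. Non-word characters become "_",
    # and the "sim_" prefix (an assumption) keeps the key a valid identifier.
    key = "sim_" + re.sub(r"\W", "_", Path(csv_path).stem)

    with pd.HDFStore(store_path, mode="a") as store:
        # read_csv with chunksize yields DataFrames of at most `chunksize` rows.
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            # append() writes in the "table" format, which can grow on disk.
            # Note: string columns may need min_itemsize if lengths vary by chunk.
            store.append(key, chunk)
    return key
```

Each simulation then lives under its own key in a single store, and one result can be loaded for plotting with `pd.read_hdf(store_path, key=key)`. Calling the function twice on the same file would append the rows twice, so a real script would probably want to check `key in store` first.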
