
Process a LOT of data

I'm working with parametric energy simulations and have ended up with more than 500 GB of data stored in .csv files. I need to process all of this data to compare the results and gain insight into the influence of the different parameters.

Each CSV file name encodes the parameters used for that simulation, so I cannot merge the files.

I normally load the .csv files into Python using pandas and a class I defined, but now, with all this data, there is not enough memory to do this.

Can you point me to a way to process this data? I need to be able to make plots and compare the CSV files.

Thank you for your time.

Convert the CSV files to HDF5, a format created to deal with massive and complex datasets. It works with pandas as well as other libraries.
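One way this conversion might look is sketched below. It assumes pandas with the optional PyTables package (`tables`) installed, and that every column is numeric with the same dtype in each chunk. The function names `key_for`, `csv_to_hdf5` and `load_columns`, the `run_` key prefix and the chunk size are made up for illustration. Each CSV is stored under its own key derived from the file name, so the parameter information in the name is kept without merging anything.

```python
import re
from pathlib import Path

import pandas as pd  # HDF5 support in pandas needs the optional "tables" package


def key_for(csv_path):
    """Build an HDF5 key from the file name (hypothetical naming scheme)."""
    # Replace characters that are awkward in HDF5 node names with "_".
    return "run_" + re.sub(r"\W", "_", Path(csv_path).stem)


def csv_to_hdf5(csv_path, store_path, chunksize=100_000):
    """Append one CSV to an HDF5 store chunk by chunk, so the whole
    file never has to fit in memory at once."""
    key = key_for(csv_path)
    with pd.HDFStore(store_path, mode="a", complevel=5, complib="zlib") as store:
        if key in store:
            store.remove(key)  # start clean if this file was converted before
        for chunk in pd.read_csv(csv_path, chunksize=chunksize):
            # append() writes in "table" format, which allows partial reads later
            store.append(key, chunk)
    return key


def load_columns(store_path, key, columns):
    """Read back only the requested columns of one simulation run."""
    return pd.read_hdf(store_path, key=key, columns=columns)
```

A possible usage pattern is to loop over all files once with `csv_to_hdf5(path, "results.h5")`, then call `load_columns("results.h5", key_for(path), ["some_column"])` for the few columns needed in a given plot (the file and column names here are placeholders). Reading only selected columns of selected runs is what keeps memory use manageable when comparing many simulations.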

