
Column Data in CSV file to h5 format

I am trying to convert a CSV file to an h5 (HDF5) file.

I have gone through multiple posts and have been able to create the h5 file, but I am still unable to pull individual columns from the CSV file and add them to the h5 file. Please let me know if there is a solution to this.

Essentially, I have four columns in my CSV file with 4000 observations in each column. Is there a way to convert it directly to h5, or to pull individual column data and edit the existing h5 file? Thank you.

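import numpy as np
import pandas as pd

filename = '/home/test3.h5'

df = pd.DataFrame(np.array([[1, 2], [4, 5]]),
                   columns=['a', 'b'])
# Write the frame to the file before trying to read it back
df.to_hdf(filename, key='data', mode='w')

print(pd.read_hdf(filename, 'data'))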

As described in the HDF5 (PyTables) section of the pandas I/O guide, there are two simple functions for working with HDF files: DataFrame.to_hdf to write and pd.read_hdf to read.

So converting a CSV to h5 could be as simple as:

df = pd.read_csv('input_file.csv')
df.to_hdf('output_file.h5', key='data')
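
To pull only some columns from the CSV, one option is the usecols argument of pd.read_csv. The sketch below is only an illustration: the column names a to d, the randomly generated data, and the temporary file paths are assumptions standing in for the real file, and it needs PyTables installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question: four columns, 4000 rows.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "input_file.csv")
h5_path = os.path.join(workdir, "output_file.h5")

rng = np.random.default_rng(0)
pd.DataFrame(rng.random((4000, 4)), columns=["a", "b", "c", "d"]).to_csv(
    csv_path, index=False
)

# Read only the columns of interest from the CSV.
subset = pd.read_csv(csv_path, usecols=["a", "c"])

# Store the subset under one key, then a single column under another key.
subset.to_hdf(h5_path, key="subset", mode="w")
subset["a"].to_hdf(h5_path, key="column_a", mode="a")

# A single column comes back as a Series.
column_a = pd.read_hdf(h5_path, "column_a")
```

Each key holds an independent object, so individual columns can be stored and read back separately if that layout suits your data.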

If you want to combine the data:

df1 = pd.read_csv('input_file.csv')
df2 = pd.read_hdf('input_file.h5', 'data')
save = pd.merge(df1, df2, on=[...]) # combine data
save.to_hdf('output_file.h5', key='data')
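
As a concrete illustration of adding a column to data already stored in an h5 file, here is a minimal sketch. The shared "id" column and the other column names are hypothetical, and the CSV data is represented by an in-memory frame to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

h5_path = os.path.join(tempfile.mkdtemp(), "input_file.h5")

# Data already stored in the HDF5 file (hypothetical columns).
existing = pd.DataFrame({"id": [1, 2, 3], "a": [10.0, 20.0, 30.0]})
existing.to_hdf(h5_path, key="data", mode="w")

# New column that would normally come from pd.read_csv(...).
new_cols = pd.DataFrame({"id": [1, 2, 3], "b": [0.1, 0.2, 0.3]})

# Join on the shared "id" column, then overwrite the stored frame.
stored = pd.read_hdf(h5_path, "data")
combined = pd.merge(stored, new_cols, on="id")
combined.to_hdf(h5_path, key="data", mode="w")

result = pd.read_hdf(h5_path, "data")
```

If the two frames have the same row order and no common key, pd.concat([stored, new_cols], axis=1) may be a simpler alternative to a merge.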

If input_file.h5 and output_file.h5 are the same file, mode='w' overwrites the file. Using a different key with mode='a' (the default) adds a new object to the file, and append=True appends rows to the dataframe already stored in the file (this requires the table format), etc.
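
The three options can be sketched as follows. The frames, key names, and file path are made up for illustration, and format='table' is passed explicitly because appending only works on data stored in table format.

```python
import os
import tempfile

import pandas as pd

h5_path = os.path.join(tempfile.mkdtemp(), "output_file.h5")

first = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
second = pd.DataFrame({"a": [5, 6], "b": [7, 8]})

# mode="w" creates the file, replacing any existing one.
first.to_hdf(h5_path, key="data", mode="w", format="table")

# mode="a" with a new key stores a second object next to the first.
second.to_hdf(h5_path, key="other", mode="a")

# append=True adds rows to the table already stored under "data".
second.to_hdf(h5_path, key="data", mode="a", append=True, format="table")

data = pd.read_hdf(h5_path, "data")
other = pd.read_hdf(h5_path, "other")
```

After this runs, "data" holds the rows of both frames and "other" holds only the second frame.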

The guide I linked contains many more examples of how to use these tools, as well as pd.HDFStore, which lets you open the whole file and inspect the keys it contains. I suggest you give it a thorough read.
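
For a quick look at what a file contains, something like the following should work. It is a small sketch with made-up keys and data, not an excerpt from the guide.

```python
import os
import tempfile

import pandas as pd

h5_path = os.path.join(tempfile.mkdtemp(), "output_file.h5")

# Two small hypothetical frames stored under different keys.
pd.DataFrame({"a": [1, 2]}).to_hdf(h5_path, key="data", mode="w")
pd.DataFrame({"b": [3, 4]}).to_hdf(h5_path, key="other", mode="a")

# Open read-only; the context manager closes the file on exit.
with pd.HDFStore(h5_path, mode="r") as store:
    keys = sorted(store.keys())  # keys are reported with a leading "/"
    data = store["data"]         # same result as pd.read_hdf(h5_path, "data")
```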

