
I am trying to find a way to convert a NumPy array to HDF5 format

I am trying to store NumPy arrays in HDF5 format. The arrays are 2D grids that vary in time, for several cases. For example, the data has the following dimensions: case number (0-100), time (0-200 years), x grid point location (0-100 m), and y grid point location (0-20 m), plus the actual data value at each location (e.g. saturation ranging from 0-100%).

I am finding it a bit difficult to store this efficiently in HDF5. It is supposed to be used later to train an RNN model. I tried just assigning a NumPy array to an HDF5 dataset (I don't know if it worked, as I didn't retrieve it). I was also confused about the different storage options for such a case and the best way to store the data so that it is easily retrievable for training a neural network.

I need to use HDF5 because it seems to optimize the use and retrieval of large data, as in the current case. I was also trying to find the best way to learn the HDF5 format. Thank you!

import h5py
import numpy as np

# Create a numpy array
arr = np.random.rand(3,3)

# Create a HDF5 file
with h5py.File('mydata.h5', 'w') as f:
    # Write the numpy array to the HDF5 file
    f.create_dataset('mydata', data=arr)
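
Since the question mentions not knowing whether the write worked, here is a minimal round-trip sketch that writes the array and reads it back. It assumes the same dataset name as above. The temporary directory and the variable names (`restored`, `round_trip_ok`) are only for illustration.

```python
import os
import tempfile

import h5py
import numpy as np

# Same kind of array as in the snippet above
arr = np.random.rand(3, 3)

# Write to a temporary location so the example leaves no file behind in the working directory
path = os.path.join(tempfile.mkdtemp(), "mydata.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("mydata", data=arr)

# Reopen read-only and pull the dataset back into memory
with h5py.File(path, "r") as f:
    dset = f["mydata"]
    stored_shape = dset.shape   # shape recorded in the file
    stored_dtype = dset.dtype   # dtype recorded in the file
    restored = dset[...]        # reads the full dataset into a NumPy array

# True if every element survived the round trip unchanged
round_trip_ok = np.array_equal(arr, restored)
print(stored_shape, stored_dtype, round_trip_ok)
```

Slicing the dataset (for example `dset[0:2]`) reads only that part from disk, which is what makes HDF5 convenient for data that does not fit in memory.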

You can also use the h5py library to append data to an existing HDF5 file (opened in 'a' mode) instead of creating a new one.
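
One possible way to do the appending, sketched for the (case, time, x, y) layout described in the question: create a dataset that is resizable along the case axis, then grow it as new cases arrive. The dataset name `saturation`, the helper `append_cases`, the small grid sizes, and the one-case-per-chunk layout are all assumptions for illustration, not something taken from the original post.

```python
import os
import tempfile

import h5py
import numpy as np

# Small hypothetical sizes; the real grid would be larger
n_time, n_x, n_y = 6, 4, 3
path = os.path.join(tempfile.mkdtemp(), "cases.h5")

# Create an empty dataset that can grow along axis 0 (the case axis)
with h5py.File(path, "w") as f:
    f.create_dataset(
        "saturation",
        shape=(0, n_time, n_x, n_y),
        maxshape=(None, n_time, n_x, n_y),  # None = unlimited along that axis
        chunks=(1, n_time, n_x, n_y),       # one case per chunk (an assumption)
        dtype="float32",
    )

def append_cases(file_path, new_cases):
    """Append an array of shape (n_cases, time, x, y) to the existing dataset."""
    with h5py.File(file_path, "a") as f:    # 'a' opens an existing file for read/write
        dset = f["saturation"]
        old_n = dset.shape[0]
        dset.resize(old_n + new_cases.shape[0], axis=0)
        dset[old_n:] = new_cases

batch_a = np.random.rand(2, n_time, n_x, n_y).astype("float32")
batch_b = np.random.rand(3, n_time, n_x, n_y).astype("float32")
append_cases(path, batch_a)
append_cases(path, batch_b)

# Read back a single case, e.g. one training sequence for the RNN
with h5py.File(path, "r") as f:
    final_shape = f["saturation"].shape
    first_case = f["saturation"][0]

print(final_shape)
```

Keeping the case index as the first axis means one training sample (a full time series of 2D grids) can be read with a single index, which may suit loading batches for an RNN.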
