How to make a multi-dims dataset in HDF5?

For example, I want to make two datasets: one is Input, the other is Output.

The data in Input and Output are multi-dimensional.

Such as: [image]

But I notice that in h5py, input_node and output_node are fixed.

Input =  f.create_dataset('Input',  (3,input_node ),dtype='float', chunks=True)
Output = f.create_dataset('Output', (3,output_node),dtype='float', chunks=True)

But HDF5 can't handle rows of different lengths; this code demonstrates it:

import h5py

X = [[1,2,3,4],[1,2],[1,2,3,4,5,6]]

with h5py.File('myfile.hdf5', "w") as ofile:
    ofile.create_dataset("X", data=X)

TypeError: Object dtype dtype('O') has no native HDF5 equivalent

So how do I make a multi-dims dataset in h5py?

I don't quite follow what your {...} denote. In Python those are used for dictionaries and sets. [] are used for lists, () for tuples. Array shape is expressed as a tuple.

Anyway, your code produces:

In [68]: X
Out[68]: 
array([ list([0.6503719194043309, 0.8703218883225239, -1.4139639093161405, 2.3288987644271835, -1.7957516518177206]),
       list([-0.1781710442823114, 0.9591992379396287, -0.6319292685053243]),
       list([0.7104492662861611, -0.8951817329357393, -0.8925882332063567, 1.5587934871464815]),
       list([-1.2384976614455354, 0.9044140291496179, 1.1277220227448401]),
       list([1.1386910680393805, -0.1775792543137636, 1.0567836199711476]),
       list([2.7535019220459707, 0.29518918092088386, -0.32166742909305196, 1.5269788560083497, 0.29633276686886767]),
       list([1.6397535315116918, -0.8839570613086122, -0.4491121599234047, -2.4461439611764333, -0.6884616200199412, -1.1920165045444608]),
       list([1.3240629024597295, 1.170019287452736, 0.5999977019629572, -0.38338543090263366, 0.6030856099472732]),
       list([-0.013529997305716175, -0.7093551284624415, -1.8611980839518099, 0.9165791506693297]),
       list([2.384081118320432, -0.6158201308053464, 0.8802896893269192, -0.7636283160361232])], dtype=object)
In [69]: y
Out[69]: array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

y is a simple array. h5py should have no problem saving that.
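As a minimal sketch of that claim, the snippet below writes a plain integer array like y to an HDF5 file and reads it back. The file name, the dataset name "y", and the use of a temporary directory are assumptions made for illustration, not part of the original post.

```python
import os
import tempfile

import h5py
import numpy as np

# A regular (rectangular) integer array, like y in the session above.
y = np.array([1, 1, 0, 0, 0, 1, 1, 0, 1, 0])

with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name, chosen only for this example.
    path = os.path.join(tmpdir, "labels.hdf5")

    # A fixed-shape numeric array maps directly onto an HDF5 dataset.
    with h5py.File(path, "w") as ofile:
        ofile.create_dataset("y", data=y)

    # Read the whole dataset back into memory as a NumPy array.
    with h5py.File(path, "r") as ifile:
        restored_y = ifile["y"][:]

print(restored_y)
```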

X is an object dtype array, containing lists of varying size:

In [72]: [len(l) for l in X]
Out[72]: [5, 3, 4, 3, 3, 5, 6, 5, 4, 4]
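The check above can be reproduced on the smaller list from the question. This is only a sketch: the variable names are illustrative, and dtype=object is passed explicitly because recent NumPy versions refuse to build an array from ragged lists without it.

```python
import numpy as np

# The ragged list from the question: rows of length 4, 2 and 6.
rows = [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5, 6]]

# With dtype=object NumPy stores one Python list per element
# instead of a 2-D block of numbers.
ragged = np.array(rows, dtype=object)
lengths = [len(r) for r in ragged]

# A rectangular list, by contrast, becomes an ordinary 2-D float array.
rect = np.array([[1, 2], [3, 4], [5, 6]], dtype=float)

print(ragged.dtype, ragged.shape, lengths)
print(rect.dtype, rect.shape)
```

An array is only "multi-dimensional" in the NumPy and HDF5 sense when every row has the same length; the object array here is one-dimensional, with a list in each slot.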

h5py cannot save that kind of object array directly. At best you can write each element to a different dataset. It will save each as an array.

....
    for i, item in enumerate(X):
        # each variable-length row goes into its own dataset: name0, name1, ...
        ofile.create_dataset('name%s' % i, data=item)
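If keeping all rows in one dataset is preferable, h5py also documents variable-length (vlen) dtypes. This is a different technique from the one-dataset-per-row loop above, shown only as a sketch. The helper names write_ragged and read_ragged, the dataset name "X", and the file name are hypothetical, and h5py.vlen_dtype assumes a reasonably recent h5py release (older releases expose the same feature through h5py.special_dtype(vlen=...)).

```python
import os
import tempfile

import h5py
import numpy as np

# The ragged list from the question.
rows = [[1, 2, 3, 4], [1, 2], [1, 2, 3, 4, 5, 6]]


def write_ragged(path, rows):
    """Store each row as one variable-length element of a 1-D dataset."""
    vlen_float = h5py.vlen_dtype(np.dtype("float64"))
    with h5py.File(path, "w") as f:
        # The dataset has one slot per row; each slot holds a float array
        # whose length may differ from its neighbours.
        dset = f.create_dataset("X", (len(rows),), dtype=vlen_float)
        for i, row in enumerate(rows):
            dset[i] = np.asarray(row, dtype="float64")


def read_ragged(path):
    """Read the rows back as a list of 1-D float arrays."""
    with h5py.File(path, "r") as f:
        dset = f["X"]
        return [np.array(dset[i], dtype="float64") for i in range(dset.shape[0])]


with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "ragged.hdf5")  # hypothetical file name
    write_ragged(path, rows)
    restored = read_ragged(path)

print([len(r) for r in restored])
```

Compared with one dataset per row, this keeps the row order in a single object, at the cost of losing chunked, sliceable access along the second axis. Padding every row to a common length with NaN is another option when a true 2-D dataset is needed.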
