
How to copy a dataset object to a different hdf5 file using pytables or h5py?

I have selected specific HDF5 datasets and want to copy them to a new HDF5 file. I found some tutorials on copying between two existing files, but what if you have just created the new file and want to copy datasets into it? I thought the approach below would work, but it raises an error. Is there a simple way to do this?

>>> dic_oldDataset['old_dataset']
<HDF5 dataset "old_dataset": shape (333217,), type "|V14">

>>> new_file = h5py.File('new_file.h5', 'a')
>>> new_file.create_group('new_group')

>>> new_file['new_group']['new_dataset'] = dic_oldDataset['old_dataset']


RuntimeError: Unable to create link (interfile hard links are not allowed)

Answer 1 (using h5py):
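This example creates a small structured array and writes it to a dataset in the first file. It then opens that dataset as my_array and passes it as data to create_dataset() on the second file. h5py reads the values and writes them to a new dataset, so no link to the first file is created.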

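import h5py
import numpy as np

# Small structured array used as sample data.
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])
print(arr.dtype)

# Write the array to a dataset in the first file.
h5file1 = h5py.File('test1.h5', 'w')
h5file1.create_dataset('/ex_group1/ex_ds1', data=arr)
print(h5file1)

# Dataset handle in the first file; its values are read when copied below.
my_array = h5file1['/ex_group1/ex_ds1']

# Create a new dataset in the second file from the first file's data.
h5file2 = h5py.File('test2.h5', 'w')
h5file2.create_dataset('/exgroup2/ex_ds2', data=my_array)
print(h5file2)

h5file1.close()
h5file2.close()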
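
As an alternative to reading the data and re-creating the dataset, h5py groups and files also provide a copy() method that can take a dataset object from one file and a group in another file. The following is a minimal sketch under that assumption; the temporary directory and the variable names src, dst and dst_group are illustrative and not part of the original answer.

```python
import os
import tempfile

import h5py
import numpy as np

# Same structured sample data as in the example above.
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])

# Illustrative file locations in a temporary directory.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'test1.h5')
dst_path = os.path.join(tmp_dir, 'test2.h5')

# Create the source file with one dataset.
with h5py.File(src_path, 'w') as src:
    src.create_dataset('/ex_group1/ex_ds1', data=arr)

# Copy the dataset object into a group of the newly created file.
with h5py.File(src_path, 'r') as src, h5py.File(dst_path, 'w') as dst:
    dst_group = dst.create_group('exgroup2')
    src.copy(src['/ex_group1/ex_ds1'], dst_group, name='ex_ds2')

# Read the copy back to check that the values arrived.
with h5py.File(dst_path, 'r') as dst:
    copied = dst['/exgroup2/ex_ds2'][...]
print(copied)
```

Unlike the plain assignment in the question, copy() duplicates the dataset inside the destination file instead of trying to create a hard link, and it should also carry the dataset's attributes along.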

Answer 2 (using pytables):
This follows the same process using PyTables functions. It creates the same structured array and writes it to a table in the first file. The table is then read into a NumPy array, my_array, which is used to create a new table in the second file.

import tables
import numpy as np

# Small structured array used as sample data.
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])
print(arr.dtype)

# Write the array to a table in the first file.
h5file1 = tables.open_file('test1.h5', mode='w', title='Test file')
my_group = h5file1.create_group('/', 'ex_group1', 'Example Group')
my_table = h5file1.create_table(my_group, 'ex_ds1', None, 'Example dataset', obj=arr)
print(h5file1)

# Read the whole table into a NumPy structured array.
my_array = my_table.read()

# Create a new table in the second file from that array.
h5file2 = tables.open_file('test2.h5', mode='w', title='Test file')
h5file2.create_table('/exgroup2', 'ex_ds2', createparents=True, obj=my_array)
print(h5file2)

h5file1.close()
h5file2.close()
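
PyTables can also copy a node directly, without reading it into memory first, by passing a group from the destination file as newparent to copy_node(). The sketch below assumes this cross-file use of copy_node(); the temporary directory and the names src, dst and dst_group are illustrative and not part of the original answer.

```python
import os
import tempfile

import numpy as np
import tables

# Same structured sample data as in the example above.
arr = np.array([(1, 'a'), (2, 'b')],
               dtype=[('foo', int), ('bar', 'S1')])

# Illustrative file locations in a temporary directory.
tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, 'test1.h5')
dst_path = os.path.join(tmp_dir, 'test2.h5')

# Create the source file with one table.
with tables.open_file(src_path, mode='w') as src:
    src.create_table('/ex_group1', 'ex_ds1', obj=arr, createparents=True)

# Copy the table node into a group of the newly created file.
with tables.open_file(src_path, mode='r') as src, \
        tables.open_file(dst_path, mode='w') as dst:
    dst_group = dst.create_group('/', 'exgroup2')
    src.copy_node('/ex_group1/ex_ds1', newparent=dst_group, newname='ex_ds2')

# Read the copy back to check that the rows arrived.
with tables.open_file(dst_path, mode='r') as dst:
    copied_table = dst.get_node('/exgroup2/ex_ds2').read()
print(copied_table)
```

Because newparent is a Group object that belongs to the second file, the copy is written there; a string path would instead be interpreted inside the source file.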

