How do you concatenate all the HDF5 files in a given directory?

I have many HDF5 files in a directory and I want to concatenate all of them. I tried the following:

from glob import iglob
import shutil
import os

PATH = r'C:\Dropbox\data_files'

destination = open('data.h5','wb')
for filename in iglob(os.path.join(PATH, '*.h5')):
    shutil.copyfileobj(open(filename, 'rb'), destination)
destination.close()

However, this only creates an empty file. Each HDF5 file contains two datasets, but I only care about taking the second one (which is named the same thing in each) and adding it to a new file.

Is there a better way of concatenating HDF files? Is there a way to fix my method?

You can combine ipython with the h5py module and the h5copy tool.

Once h5copy and h5py are installed, just open the ipython console in the folder where all your .h5 files are stored and use this code to merge them into an output.h5 file:

import h5py
import os

# Collect the dataset names stored in every .h5 file in the current folder
d_names = [n for n in os.listdir(os.getcwd()) if n.endswith('.h5')]
d_struct = {}  # Here we will store the database structure
for i in d_names:
    with h5py.File(i, 'r') as f:
        d_struct[i] = list(f.keys())  # copy the names out before the file closes

# Use the h5copy shell command (via IPython's ! syntax) to copy each dataset
for i in d_names:
    for j in d_struct[i]:
        !h5copy -i '{i}' -o 'output.h5' -s {j} -d {j}
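
Note that h5copy will fail on name collisions, so if every file stores its datasets under the same names you will need to pass a distinct destination name to -d for each copy. Since you only care about the second dataset (which has the same name in every file), a pure h5py alternative is to read that dataset from each file and concatenate the arrays into a single dataset in a new file. A minimal sketch, assuming the shared dataset name is 'data' (a placeholder) and that the arrays can be stacked along their first axis:

import os
from glob import iglob

import h5py
import numpy as np

PATH = r'C:\Dropbox\data_files'
DATASET = 'data'  # placeholder: the shared name of the dataset you want

# Read the target dataset from every .h5 file in the directory
arrays = []
for filename in sorted(iglob(os.path.join(PATH, '*.h5'))):
    with h5py.File(filename, 'r') as f:
        arrays.append(f[DATASET][...])  # load the full dataset into memory

# Concatenate along the first axis and write a single output file
with h5py.File('data.h5', 'w') as out:
    out.create_dataset(DATASET, data=np.concatenate(arrays, axis=0))

Unlike the shutil.copyfileobj approach, this produces a valid HDF5 file: HDF5 is a structured binary container, so raw byte-level concatenation of several files does not yield a readable file.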
