hdf5 made with h5py py2 corrupted after opening with h5py in py3
I have a file created with h5py in Python 2.7.
These steps lead to corruption:
1. I download a fresh copy of it from a collaborator using scp. It is whole and 286MB.
2. I check that it is readable by opening it with hdfview. This shows all the datasets and groups properly.
3. I exit hdfview.
4. I repeat steps 2 and 3 to ensure hdfview is not corrupting the file.
5. I open ipython 3.6 and run:
import h5py
f = h5py.File(filename, 'r')
g = f['/sol000']  # one group that should be there
I get KeyError: "Unable to open object (Object 'sol000' doesn't exist)"
6. I call f.close() and exit ipython.
7. I again open it with hdfview and the entire structure is gone. The file is now 4KB.
I am able to open the file with h5py under Python 2 and access all the datasets, but I must use Python 3 for my code.
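One way to pin down which step shrinks the file is to fingerprint it before and after each action. The following is only a minimal sketch using the standard library; the helper names `fingerprint` and `unchanged_after` are hypothetical and not part of h5py or hdfview.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a large file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def unchanged_after(path, action):
    """Run `action(path)` and report whether the file is byte-identical afterwards."""
    before = fingerprint(path)
    try:
        action(path)
    except Exception as error:  # e.g. a KeyError from a failed group lookup
        print(f"action raised {type(error).__name__}: {error}")
    after = fingerprint(path)
    if before != after:
        print(f"file changed: {before[0]} bytes -> {after[0]} bytes")
    return before == after
```

For the steps above, `action` could be something like `lambda p: h5py.File(p, "r").close()` (assuming h5py is installed). A return value of `False` would point at that call as the one modifying the file.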
File created on Fedora 24 64-bit, python 2.7, hdf5 2.7.0
System trying to read it on Fedora 25 64-bit, python 3.6, h5py 2.7.0
On system 1:
import h5py
import numpy as np
f = h5py.File("file.hdf5","w")
f.create_dataset("/sol000/data",(100,100),dtype=float)
f["/sol000/data"][...] = np.zeros([100,100],dtype=float)  # write into the dataset created above
f.close()
On system 2: Do steps 1-4.
import h5py
f = h5py.File("file.hdf5","r")
f.visit(lambda *x:print(x))
#(sol000/data,)
f.close()
The solution was to enforce libver=earliest, i.e. the following code worked to open the file:
import h5py
f = h5py.File("file.hdf5","r",libver="earliest")
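Since the original file was destroyed by a read attempt, it may be safer to experiment on a throwaway copy. Here is a rough sketch of that idea; `open_scratch_copy` is a hypothetical helper, and the default opener assumes h5py is installed and simply reuses the `libver="earliest"` argument from the answer.

```python
import shutil
import tempfile
from pathlib import Path


def open_scratch_copy(path, opener=None):
    """Copy `path` into a fresh temp directory and open the copy instead.

    Returns (opened_object, scratch_path). The original file is only read.
    """
    if opener is None:
        # Assumption: h5py is installed. It is imported lazily so the helper
        # can also be used with any other opener.
        import h5py

        def opener(p):
            return h5py.File(p, "r", libver="earliest")

    scratch_dir = Path(tempfile.mkdtemp(prefix="h5_scratch_"))
    scratch = scratch_dir / Path(path).name
    shutil.copy2(path, scratch)  # the original is never opened by the opener
    return opener(str(scratch)), scratch
```

Calling `f, scratch = open_scratch_copy("file.hdf5")` would then leave the downloaded file alone even if the open goes wrong. The caller is responsible for closing `f` and deleting the scratch directory.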
I've discovered a possible inconsistency in the h5py documentation. It claims that:
The “earliest” option means that HDF5 will make a best effort to be backwards compatible.
The default is “earliest”.
This can't be true if it only works when I explicitly set it. My collaborator, it turns out, created the corruptible file with an older version of the HDF5 C library.
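Because the HDF5 C library version turned out to matter, it can help to compare what each system reports. A small sketch, assuming h5py may or may not be installed; `describe_h5py` is a hypothetical helper name, while `h5py.version.version` and `h5py.version.hdf5_version` are attributes h5py provides.

```python
def describe_h5py():
    """Return the h5py and HDF5 C library versions, or None if h5py is missing."""
    try:
        import h5py
    except ImportError:
        return None
    return {
        "h5py": h5py.version.version,       # version of the Python bindings
        "hdf5": h5py.version.hdf5_version,  # HDF5 C library version reported by h5py
    }


print(describe_h5py())
```

Running this on both the machine that wrote the file and the one that reads it shows whether the two sides use different HDF5 library versions.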