I have a file created with h5py in python 2.7.
These steps lead to a corruption:
1. I download a fresh copy of it from a collaborator using scp. It is whole and 286 MB.
2. I check that it is readable by opening it with hdfview. This shows all the datasets and groups properly.
3. I exit hdfview.
4. I repeat steps 2 and 3 to ensure hdfview is not corrupting the file.
5. I open ipython 3.6 and run:
    import h5py
    f = h5py.File(filename, 'r')
    g = f['/sol000']  # one group that should be there
   I get KeyError: "Unable to open object (Object 'sol000' doesn't exist)", so I call
    f.close()
6. I exit ipython and open the file again with hdfview. The entire structure is gone and the file is now 4 KB.
I am able to open the file with h5py under Python 2 and access all the datasets, but I must use Python 3 for my code.
File created on Fedora 24 64-bit, Python 2.7, h5py 2.7.0
System trying to read it on Fedora 25 64-bit python 3.6, h5py 2.7.0
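To pin down which step actually alters the file, it may help to record its size and hash before and after each step. The following is only a sketch using the standard library; the helper names `fingerprint` and `changed_by` are made up for illustration and are not part of h5py.

```python
import hashlib
import os


def fingerprint(path, chunk_size=1 << 20):
    """Return (size_in_bytes, sha256_hex) for the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so a large file does not have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()


def changed_by(path, action):
    """Run `action(path)` and report whether the file's fingerprint changed."""
    before = fingerprint(path)
    action(path)
    after = fingerprint(path)
    return before != after
```

For example, `changed_by("file.hdf5", lambda p: h5py.File(p, "r").close())` would be expected to return False for a healthy read-only open; a True result would point at the open call rather than at scp or hdfview.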
On system 1:
import h5py
import numpy as np
f = h5py.File("file.hdf5","w")
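f.create_dataset("/sol000/data", (100, 100), dtype=float)
# Write into the existing dataset; assigning f["/sol000/data"] = ... again would fail because the name already exists
f["/sol000/data"][...] = np.zeros([100, 100], dtype=float)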
f.close()
On system 2: Do steps 1-4.
import h5py
f = h5py.File("file.hdf5","r")
f.visit(lambda *x:print(x))
#(sol000/data,)
f.close()
The solution was to enforce libver=earliest. I.e., the following code worked to open the file:
import h5py
f = h5py.File("file.hdf5", "r", libver="earliest")
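Since a failed open destroyed the original file here, it may be safer to experiment on a scratch copy. Below is a minimal sketch of that idea; `open_scratch_copy` is a hypothetical helper, not an h5py function, and it simply forwards `libver="earliest"` to `h5py.File` when no other opener is supplied.

```python
import shutil
import tempfile
from pathlib import Path


def open_scratch_copy(path, opener=None, **kwargs):
    """Copy `path` into a temp directory and open the copy read-only.

    The original file is never opened, so it cannot be truncated by the open call.
    """
    if opener is None:
        import h5py  # imported lazily; only needed for the default opener
        opener = h5py.File
        kwargs.setdefault("libver", "earliest")
    scratch_dir = Path(tempfile.mkdtemp(prefix="h5_scratch_"))
    scratch = scratch_dir / Path(path).name
    shutil.copy2(path, scratch)
    return opener(str(scratch), "r", **kwargs)
```

Usage would look like `with open_scratch_copy("file.hdf5") as f: print(list(f.keys()))`. The temp directory is left behind on purpose so the copy can be inspected afterwards; remove it manually when done.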
I've discovered a possible inconsistency in the h5py documentation. It claims that:
The “earliest” option means that HDF5 will make a best effort to be backwards compatible.
The default is “earliest”.
This can't be true if it only works when I explicitly set it. My collaborator, it turns out, created the corruptible file with an older version of the HDF5 C library.
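Because the difference turned out to be in the underlying HDF5 C library, comparing versions on both systems is a reasonable first check. A small sketch follows; the function name `describe_hdf5_environment` is invented here, while `h5py.version.version` and `h5py.version.hdf5_version` are the attributes h5py exposes for its own version and the HDF5 library it was built against.

```python
import platform


def describe_hdf5_environment():
    """Collect the Python, h5py and HDF5 versions (None if h5py is missing)."""
    info = {"python": platform.python_version(), "h5py": None, "hdf5": None}
    try:
        import h5py
    except ImportError:
        return info
    info["h5py"] = h5py.version.version
    info["hdf5"] = h5py.version.hdf5_version
    return info


print(describe_hdf5_environment())
```

Running this on both machines and comparing the "hdf5" entries should show whether the two h5py installations are linked against different HDF5 releases, even when the h5py version itself is the same.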