I am trying to read in an HDF file, but no groups show up. I have tried a couple of different methods using tables and h5py, but neither displays the groups in the file. I checked, and the file is 'Hierarchical Data Format (version 5) data' (see Update). The file information is here for reference.
Example data can be found here
import h5py
import tables as tb
hdffile = "TRMM_LIS_SC.04.1_2010.260.73132"
Using h5py:
f = h5py.File(hdffile,'w')
print(f)
Outputs:
< HDF5 file "TRMM_LIS_SC.04.1_2010.260.73132" (mode r+) >
[]
Using tables:
fi=tb.openFile(hdffile,'r')
print(fi)
Outputs:
TRMM_LIS_SC.04.1_2010.260.73132 (File) ''
Last modif.: 'Wed Aug 10 18:41:44 2016'
Object Tree:
/ (RootGroup) ''
Closing remaining open files:TRMM_LIS_SC.04.1_2010.260.73132...done
UPDATE
h5py.File(hdffile,'w') overwrote the file and emptied it.
Now my question is: how do I read an HDF version 4 file into Python, since neither h5py nor tables works?
How big is the file? I think h5py.File(hdffile,'w') overwrites it (mode 'w' creates the file and truncates it if it already exists), so it's empty. Use h5py.File(hdffile,'r') to read.
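A minimal sketch of the read-only approach, assuming h5py is installed and the file really is HDF5. The helper name list_hdf5_contents and the demo group and dataset names (flashes, radiance) are made up for illustration; they are not taken from the TRMM file.

```python
import os
import tempfile

import h5py


def list_hdf5_contents(path):
    """Return the names of every group and dataset in an HDF5 file."""
    names = []
    # Mode "r" is read-only: it never creates or truncates the file.
    with h5py.File(path, "r") as f:
        # visit() calls the function once per object name; list.append
        # returns None, so the walk continues through the whole tree.
        f.visit(names.append)
    return names


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        demo_path = os.path.join(tmp, "demo.h5")
        # Build a throwaway file; "w" is safe here only because the file is new.
        with h5py.File(demo_path, "w") as f:
            f.create_group("flashes").create_dataset("radiance", data=[1, 2, 3])
        print(list_hdf5_contents(demo_path))
```

Opening with 'r' also fails loudly if the path is wrong, instead of silently creating an empty file the way 'w' does.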
I don't have enough karma to reply to @Luke H's answer, but reading it into pandas might not be a good idea. Pandas HDF5 support uses PyTables, which is an "opinionated" way of using HDF5. This means that it stores extra metadata (e.g., the index). So I would only use PyTables to read the file if it was made with PyTables.
UPDATE:
I would recommend first converting your HDF version 4 files to HDF5 (.h5) files, for example with the HDF Group's h4toh5 tool, since h5py, PyTables, and pandas all work with HDF version 5 only.
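Before converting anything, it may help to confirm which format a file actually is. The sketch below checks the leading bytes against the published HDF4 and HDF5 signatures using only the standard library. The function name detect_hdf_version is an illustrative choice, and the check only looks at offset 0, so an HDF5 file with a user block in front of its superblock would be reported as "unknown".

```python
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"  # 8-byte HDF5 superblock signature
HDF4_SIGNATURE = b"\x0e\x03\x13\x01"   # 4-byte HDF4 magic number


def detect_hdf_version(path):
    """Return "HDF5", "HDF4", or "unknown" based on the file's first bytes."""
    # Binary read mode: the file is only inspected, never modified.
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(HDF5_SIGNATURE):
        return "HDF5"
    if head.startswith(HDF4_SIGNATURE):
        return "HDF4"
    return "unknown"
```

Run it on an untouched copy of the data: a file that has already been opened with h5py in 'w' mode will report HDF5 even if the original was HDF4.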
OLD answer:
Try it this way:
import pandas as pd
store = pd.HDFStore(filename)
print(store)
This should print details about the HDF file, including the stored keys, the lengths of the stored DataFrames, etc.
Demo:
In [18]: fn = r'C:\Temp\a.h5'
In [19]: store = pd.HDFStore(fn)
In [20]: print(store)
<class 'pandas.io.pytables.HDFStore'>
File path: C:\Temp\a.h5
/df_dc frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index],dc->[a,b,c])
/df_no_dc frame_table (typ->appendable,nrows->10,ncols->3,indexers->[index])
Now you can read DataFrames using the keys from the output above:
In [21]: df = store.select('df_dc')
In [22]: df
Out[22]:
    a   b   c
0  92  80  86
1  27  49  62
2  55  64  60
3  31  66   3
4  37  75  81
5  49  69  87
6  59   0  87
7  69  91  39
8  93  75  31
9  21  15   7
Try using pandas:
import pandas as pd
f = pd.read_hdf("C:/path/to/file")
See the pandas HDF documentation here.
This should read an HDF5 file into a DataFrame that you can then manipulate. Note that read_hdf expects a file in the PyTables layout that pandas itself writes.