I have an algorithm that works with a CSV file, like this:
# display_id, ad_id, clicked (1 or 0)
cols = {'display_id': np.int32,
        'ad_id': np.int32,
        'clicked': bool}
trainData = pd.read_csv("trainData.csv", dtype=cols)
for did, ad, c in trainData.itertuples(index=False):
    print(did + ad + c)  # example
But now I have a '.h5' file, and I want to use it with the same algorithm. I am reading the file as follows:
store = pd.HDFStore('data.h5')
But as far as I know, HDFStore returns NumPy arrays. Do you have any idea how to use this data file in the algorithm?
The main difference in this case is that an HDF5 file can contain multiple DataFrames/tables, so you always have to specify a key (identifier).
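For illustration, here is a minimal sketch of how a DataFrame ends up under such a key in the first place (the file name, key, and column values are hypothetical; writing HDF5 via pandas requires the PyTables package):

```python
import numpy as np
import pandas as pd

# A small sample DataFrame (hypothetical data).
df = pd.DataFrame({'display_id': np.arange(5, dtype=np.int32),
                   'ad_id': np.arange(100, 105, dtype=np.int32),
                   'clicked': [True, False, True, False, True]})

# Store it under the key 'test'; an HDFStore can hold many such keys.
with pd.HDFStore('data.h5') as store:
    store.put('test', df, format='table')

# Keys are always listed with a leading '/'.
with pd.HDFStore('data.h5') as store:
    print(store.keys())  # ['/test']
```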
Here is a small demo:
In [14]: fn = r'C:\Temp\test_str.h5'
In [15]: store = pd.HDFStore(fn)
In [16]: store
Out[16]:
<class 'pandas.io.pytables.HDFStore'>
File path: C:\Temp\test_str.h5
/test frame_table (typ->appendable,nrows->10000,ncols->4,indexers->[index],dc->[a,c])
In this case only one DF (key '/test') is stored in this HDF5 file.
Assuming that all your HDF5 files have only one DF (one key per file) you can process them dynamically by choosing the first key:
In [17]: store.keys()
Out[17]: ['/test']
In [18]: key = store.keys()[0]
In [19]: key
Out[19]: '/test'
In [20]: store[key].head()
Out[20]:
a b c txt
0 689347 129498 770470 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
1 954132 97912 783288 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
2 40548 938326 861212 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
3 869895 39293 242473 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...
4 938918 487643 362942 XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX...