Read n rows from an h5 file

I want to read only 10 rows from an h5 file:

df = pd.read_hdf('data.h5', 'cleanuserbase', start=0, stop=10)

But that doesn't work, because it reads all the rows.

This works only if your object is stored in table format (rather than fixed format).

In [11]: df = pd.DataFrame(np.random.randn(100, 2))

In [12]: store = pd.HDFStore('store.h5')

In [13]: df.to_hdf("store.h5", "df", format="table")

In [14]: store.select("df", "index < 2")
Out[14]:
          0         1
0 -0.245982 -1.047534
1 -0.633943 -1.218812

In [15]: pd.read_hdf("store.h5", "df", start=0, stop=2)  # start/stop are row positions, so this also works with a non-integer index
Out[15]:
          0         1
0 -0.245982 -1.047534
1 -0.633943 -1.218812

See table format in the docs.
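
As a self-contained illustration of the table-format route, here is a minimal sketch that writes a frame with `format="table"` and reads back only the first 10 rows. The temporary file path, the key `"df"` and the column names are assumptions made up for this example, and it assumes PyTables is installed.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical location and key, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "store.h5")

df = pd.DataFrame(np.random.randn(100, 2), columns=["a", "b"])

# format="table" stores the frame as a PyTables Table, which supports partial reads.
df.to_hdf(path, key="df", format="table")

# start/stop are positional row offsets; stop is exclusive.
first_ten = pd.read_hdf(path, key="df", start=0, stop=10)

# The same selection through an HDFStore opened read-only and closed on exit.
with pd.HDFStore(path, mode="r") as store:
    also_first_ten = store.select("df", start=0, stop=10)
    total_rows = store.get_storer("df").nrows  # row count without loading the data
```

Because `start` and `stop` count rows by position, they do not depend on what the index labels look like.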


If your table is in fixed format, it can only be read in whole (though perhaps this should raise an error):

In [21]: df.to_hdf("store.h5", "fixed_df", format="fixed")

In [22]: pd.read_hdf("store.h5", "fixed_df", start=0, stop=2)
Out[22]:
           0         1
0   2.532604 -0.084852
1   0.735833 -1.100600
2  -0.415245 -2.050627
3  -0.915045 -0.638667
...  # and all the other rows
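
If you do not control how the file was written, one possible workaround is to check the storage format first and fall back to slicing in memory. The sketch below is only an assumption about how you might wrap this: `read_head`, the temporary path and the keys are hypothetical names, and `is_table` is an attribute of the storer object returned by `HDFStore.get_storer`.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_head(path, key, n):
    """Return the first n rows stored under key, whatever the storage format."""
    with pd.HDFStore(path, mode="r") as store:
        if store.get_storer(key).is_table:
            # Table format: only the requested rows are selected.
            return store.select(key, start=0, stop=n)
        # Fixed format: load the whole object, then slice in memory.
        return store.get(key).iloc[:n]


# Hypothetical demo data: the same frame written in both formats.
demo_path = os.path.join(tempfile.mkdtemp(), "store.h5")
demo_df = pd.DataFrame(np.random.randn(100, 2), columns=["a", "b"])
demo_df.to_hdf(demo_path, key="table_df", format="table")
demo_df.to_hdf(demo_path, key="fixed_df", format="fixed")

head_from_table = read_head(demo_path, "table_df", 5)
head_from_fixed = read_head(demo_path, "fixed_df", 5)
```

The fixed-format branch still pays the cost of reading everything, so for large files it may be worth rewriting the data once in table format instead.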

This is not implemented for fixed stores at the moment (but it works for table stores, see Andy's answer); see the open issue here.

That said, the stores themselves do actually support indexing; it's just not built out. The following peeks into the internals.

In [35]: df = pd.DataFrame(np.random.randn(10, 2), columns=list('ab'))

In [36]: store = pd.HDFStore('test.h5',mode='w')

In [37]: store.put('df',df)

In [38]: store
Out[38]: 
<class 'pandas.io.pytables.HDFStore'>
File path: test.h5
/df            frame        (shape->[10,2])

In [39]: mask = slice(4,10)

In [40]: s = store.get_storer('df').storable

In [41]: pd.DataFrame(s.block0_values[mask], index=s.axis1[mask], columns=s.axis0)
Out[41]: 
axis0         a         b
4     -1.347325 -0.936605
5     -0.342814 -0.452055
6      0.951228  0.160918
7     -0.096133  0.816032
8     -0.731431  1.190047
9     -1.050826  0.348107

In [42]: store.close()

I suppose this could raise NotImplementedError until this issue is resolved.
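
Until then, you could enforce that behaviour yourself in user code. The following is a hedged sketch, not something pandas provides: `read_rows_strict` is a hypothetical wrapper that refuses partial reads on fixed stores rather than relying on how `start`/`stop` are handled for them. The path and keys are made up for the demo.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def read_rows_strict(path, key, start, stop):
    """Read rows [start, stop) from a table store; refuse fixed stores."""
    with pd.HDFStore(path, mode="r") as store:
        if not store.get_storer(key).is_table:
            raise NotImplementedError(
                f"start/stop are only honoured here for table format; {key!r} is fixed"
            )
        return store.select(key, start=start, stop=stop)


# Hypothetical demo data: one key per storage format.
strict_path = os.path.join(tempfile.mkdtemp(), "test.h5")
strict_df = pd.DataFrame(np.random.randn(10, 2), columns=list("ab"))
strict_df.to_hdf(strict_path, key="table_df", format="table")
strict_df.to_hdf(strict_path, key="fixed_df", format="fixed")

rows = read_rows_strict(strict_path, "table_df", 4, 10)
```

Failing loudly makes it obvious when a file needs to be rewritten in table format before partial reads can be trusted.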
