Read n rows from an h5 file
I want to read only 10 rows from an h5 file:
df = pd.read_hdf('data.h5', 'cleanuserbase', start=0, stop=10)
But that doesn't work: it reads all the rows.
This works only if your object is stored in table format (rather than fixed format).
In [11]: df = pd.DataFrame(np.random.randn(100, 2))
In [12]: store = pd.HDFStore('store.h5')
In [13]: df.to_hdf("store.h5", "df", format="table")
In [14]: store.select("df", "index < 2")
Out[14]:
0 1
0 -0.245982 -1.047534
1 -0.633943 -1.218812
In [15]: pd.read_hdf("store.h5", "df", start=0, stop=2) # works if non-integer index
Out[15]:
0 1
0 -0.245982 -1.047534
1 -0.633943 -1.218812
See "table format" in the docs.
If your object is stored in fixed format, it can only be read in whole (though perhaps this should raise an error):
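As a standalone version of the session above, here is a minimal sketch that writes a frame in table format and reads back only the first rows. The helper name `write_table_and_read_head`, the column names, and the temporary file are assumptions made for illustration; it also assumes the PyTables package is installed, since pandas needs it for HDF5.

```python
import os
import tempfile

import numpy as np
import pandas as pd  # HDF5 I/O in pandas also needs the PyTables package ("tables")


def write_table_and_read_head(n_rows: int) -> pd.DataFrame:
    """Write a 100-row frame in table format, then read back the first n_rows."""
    df = pd.DataFrame(np.random.randn(100, 2), columns=["a", "b"])
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.h5")  # hypothetical file name
        # format="table" is what makes start/stop row selection possible
        df.to_hdf(path, key="df", format="table")
        head = pd.read_hdf(path, key="df", start=0, stop=n_rows)
    return head


head = write_table_and_read_head(10)
print(head.shape)
```

Here `start` and `stop` are row positions, with `stop` exclusive, like a Python slice.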
In [21]: df.to_hdf("store.h5", "fixed_df", format="fixed")
In [22]: pd.read_hdf("store.h5", "fixed_df", start=0, stop=2)
Out[22]:
0 1
0 2.532604 -0.084852
1 0.735833 -1.100600
2 -0.415245 -2.050627
3 -0.915045 -0.638667
... # and all the other rows
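One possible workaround, if you control the file, is to pay for a full read once and rewrite the object in table format, so that later reads can use start/stop. This is only a sketch: the helper `convert_to_table` and the key names are hypothetical, and it assumes the whole object fits in memory during the conversion.

```python
import os
import tempfile

import numpy as np
import pandas as pd  # requires the PyTables package for HDF5 support


def convert_to_table(path: str, fixed_key: str, table_key: str) -> None:
    """Copy an object stored in fixed format to a new key in table format."""
    df = pd.read_hdf(path, key=fixed_key)  # one full read of the fixed object
    # to_hdf opens the file in append mode by default, so the old key is kept
    df.to_hdf(path, key=table_key, format="table")


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "store.h5")  # hypothetical file name
df = pd.DataFrame(np.random.randn(100, 2), columns=["a", "b"])
df.to_hdf(path, key="fixed_df", format="fixed")

convert_to_table(path, "fixed_df", "table_df")
first_two = pd.read_hdf(path, key="table_df", start=0, stop=2)
print(first_two)
```

Table format is slower to write and takes more space than fixed format, so this trade-off only pays off if partial reads are frequent.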
This is not implemented for fixed stores at the moment (but it works for table stores, see Andy's answer); see the open issue here.
That said, the stores themselves do actually support indexing. It's just not built out. The following peeks into the internals.
In [35]: df = pd.DataFrame(np.random.randn(10,2),columns=list('ab'))
In [36]: store = pd.HDFStore('test.h5',mode='w')
In [37]: store.put('df',df)
In [38]: store
Out[38]:
<class 'pandas.io.pytables.HDFStore'>
File path: test.h5
/df frame (shape->[10,2])
In [39]: mask = slice(4,10)
In [40]: s = store.get_storer('df').storable
In [41]: pd.DataFrame(s.block0_values[mask],index=s.axis1[mask],columns=s.axis0)
Out[41]:
axis0 a b
4 -1.347325 -0.936605
5 -0.342814 -0.452055
6 0.951228 0.160918
7 -0.096133 0.816032
8 -0.731431 1.190047
9 -1.050826 0.348107
In [42]: store.close()
I suppose this could raise NotImplementedError until this issue is resolved.
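Until then, you could enforce that behaviour in your own code. The sketch below is not part of pandas: `read_rows` is a hypothetical wrapper, and it relies on the `is_table` attribute of the object returned by `get_storer`, which is an internal detail and may change between versions.

```python
import os
import tempfile

import numpy as np
import pandas as pd  # requires the PyTables package for HDF5 support


def read_rows(path: str, key: str, start: int, stop: int) -> pd.DataFrame:
    """Read rows [start, stop) from a table store; refuse fixed stores."""
    with pd.HDFStore(path, mode="r") as store:
        storer = store.get_storer(key)
        # is_table is an internal attribute: False for fixed, True for table
        if not storer.is_table:
            raise NotImplementedError(
                f"{key!r} is stored in fixed format; partial reads are not supported here"
            )
        return store.select(key, start=start, stop=stop)


tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "store.h5")  # hypothetical file name
df = pd.DataFrame(np.random.randn(20, 2), columns=["a", "b"])
df.to_hdf(path, key="fixed_df", format="fixed")
df.to_hdf(path, key="table_df", format="table")

print(len(read_rows(path, "table_df", 0, 3)))
try:
    read_rows(path, "fixed_df", 0, 3)
except NotImplementedError as exc:
    print("refused:", exc)
```

Failing loudly avoids silently loading a large fixed store into memory when only a few rows were requested.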