Pandas: How to combine sub-grouped DataFrames to a single DataFrame
I'd like to group a DataFrame by date, take the mean of each group, and then combine the results into a single DataFrame.
import datetime as dt
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': ['2014-01-01', '2014-01-01', '2014-01-02', '2014-01-03', '2014-01-03', '2014-01-04', '2014-01-04', '2014-01-05'], 'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'], 'C': np.random.randn(8), 'D': np.random.randn(8)})
df1['DT'] = pd.to_datetime(df1.A)  # parse the date strings in 'A'
df1 = df1.set_index('DT')  # use the parsed dates in 'DT' as the index
>>> df1
                     A      B         C         D
DT
2014-01-01  2014-01-01    one -0.626296 -0.360708
2014-01-01  2014-01-01    one  0.212051 -1.275909
2014-01-02  2014-01-02    two -0.305094  0.351046
2014-01-03  2014-01-03  three  1.136001  1.029615
2014-01-03  2014-01-03    two -0.801339 -0.084780
2014-01-04  2014-01-04    two  0.683201  1.092694
2014-01-04  2014-01-04    one  0.476437  0.250309
2014-01-05  2014-01-05  three -1.007285  0.420201
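df2 = pd.DataFrame()  # new, empty DataFrame meant to hold the results
For each row in df1, average the data for that date and the day before:
for k in df1.index:
    # numeric_only=True skips the string columns 'A' and 'B'
    sub = df1[k + dt.timedelta(days=-1):k].mean(numeric_only=True)
    print(sub)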
sub looks like some kind of DataFrame, but how do I combine all of them into a single DataFrame, df2? The loop prints:
C -0.207122
D -0.818309
dtype: float64
C -0.207122
D -0.818309
dtype: float64
C -0.239779
D -0.428524
dtype: float64
C 0.009856
D 0.431960
dtype: float64
C 0.009856
D 0.431960
dtype: float64
C 0.373575
D 0.571959
dtype: float64
C 0.373575
D 0.571959
dtype: float64
C 0.050784
D 0.587734
dtype: float64
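Each sub is a Series holding the column means. If you want the calculation above, you can concatenate the results and attach them to the original frame as follows:
# numeric_only=True keeps the string columns 'A' and 'B' out of the mean
res = pd.concat([df1[k + dt.timedelta(days=-1):k].mean(numeric_only=True) for k in df1.index], axis=1)
# res has one column per row of df1; transpose it and reuse df1's index
df1 = pd.concat([df1, res.T.set_index(df1.index)], axis=1)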
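The same idea can be sketched so that the means first land in a separate df2, as the question asks, and are then attached under distinct names. This is only an illustration, not the answer's method. The seeded random generator and the column names C_2d_mean and D_2d_mean are assumptions made here for reproducibility and to avoid ending up with two columns called C and two called D.

```python
import datetime as dt

import numpy as np
import pandas as pd

# Same dates as in the question; the seeded generator is an assumption
# made here so the example is reproducible.
rng = np.random.default_rng(0)
dates = ['2014-01-01', '2014-01-01', '2014-01-02', '2014-01-03',
         '2014-01-03', '2014-01-04', '2014-01-04', '2014-01-05']
df1 = pd.DataFrame({'A': dates,
                    'C': rng.standard_normal(8),
                    'D': rng.standard_normal(8)})
df1['DT'] = pd.to_datetime(df1['A'])
df1 = df1.set_index('DT').drop(columns='A')

# One Series of column means per row of df1, covering the row's date
# and the day before it.
one_day = dt.timedelta(days=1)
rows = [df1.loc[k - one_day:k, ['C', 'D']].mean() for k in df1.index]

# A list of Series becomes a single DataFrame, one row per Series.
df2 = pd.DataFrame(rows, index=df1.index)

# Attach the means positionally under hypothetical, non-clashing names.
combined = df1.copy()
combined['C_2d_mean'] = df2['C'].to_numpy()
combined['D_2d_mean'] = df2['D'].to_numpy()
print(combined)
```

Assigning the values positionally sidesteps any index alignment, which matters here because the dates in the index are not unique.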