How to aggregate data by historical time series values in python dataframe?
I have a dataframe like this:
import pandas as pd
d = {'ID':["A","A","A","A","A","A","A","A","A","A","A","A"],
'date':["2017-01-01","2017-01-01","2017-01-01","2017-01-02","2017-01-02","2017-01-02","2017-01-03","2017-01-03",
"2017-01-03","2017-01-04","2017-01-04","2017-01-04"],
'time':["00:00","06:00","12:00","00:00","06:00","12:00","00:00","06:00","12:00","00:00","06:00","12:00"],
'value':[23,100,330,57,122,477,46,99,469,37,118,499]}
df = pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'])
print(df)
   ID       date   time  value
0   A 2017-01-01  00:00     23
1   A 2017-01-01  06:00    100
2   A 2017-01-01  12:00    330
3   A 2017-01-02  00:00     57
4   A 2017-01-02  06:00    122
5   A 2017-01-02  12:00    477
6   A 2017-01-03  00:00     46
7   A 2017-01-03  06:00     99
8   A 2017-01-03  12:00    469
9   A 2017-01-04  00:00     37
10  A 2017-01-04  06:00    118
11  A 2017-01-04  12:00    499
I want to generate a new column that holds historical data, computed in time order. The final dataframe should look like this:
   ID       date   time  value    avg
0   A 2017-01-01  00:00     23     23
1   A 2017-01-01  06:00    100    100
2   A 2017-01-01  12:00    330    330
3   A 2017-01-02  00:00     57     23
4   A 2017-01-02  06:00    122    100
5   A 2017-01-02  12:00    477    330
6   A 2017-01-03  00:00     46     40   # (23+57)/2 = 40
7   A 2017-01-03  06:00     99    111   # (100+122)/2 = 111
8   A 2017-01-03  12:00    469  403.5   # (330+477)/2 = 403.5
9   A 2017-01-04  00:00     37     42   # (23+57+46)/3 = 42
10  A 2017-01-04  06:00    118    107   # (100+122+99)/3 = 107
11  A 2017-01-04  12:00    499  425.3   # (330+477+469)/3 = 425.333
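The new column avg is the mean of the values recorded at the same time of day on all earlier dates. The first day has no history, so it keeps its own values, and the second day simply copies the first day's values. The third day is the mean of the first two days, and so on.
This is just a sample dataset. I hope someone can help with this. Thanks!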
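To make the rule concrete, here is a minimal plain-Python sketch that reproduces the avg column by hand. The names `rows`, `history`, and `avg` are illustrative only, and the sketch assumes the rows are already in date order, as they are in the sample.

```python
from collections import defaultdict

# Sample rows from the question as (date, time, value), in date order.
rows = [
    ("2017-01-01", "00:00", 23), ("2017-01-01", "06:00", 100), ("2017-01-01", "12:00", 330),
    ("2017-01-02", "00:00", 57), ("2017-01-02", "06:00", 122), ("2017-01-02", "12:00", 477),
    ("2017-01-03", "00:00", 46), ("2017-01-03", "06:00", 99), ("2017-01-03", "12:00", 469),
    ("2017-01-04", "00:00", 37), ("2017-01-04", "06:00", 118), ("2017-01-04", "12:00", 499),
]

history = defaultdict(list)  # time of day -> values seen on earlier dates
avg = []
for date, time, value in rows:
    past = history[time]
    # With no earlier date available, fall back to the row's own value,
    # which is what the expected table shows for the first day.
    avg.append(sum(past) / len(past) if past else value)
    past.append(value)

for (date, time, value), a in zip(rows, avg):
    print(date, time, value, round(a, 3))
```

Each row's average is computed before its own value is appended to `history`, so a row never contributes to its own avg except on the first day.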
IIUC, let's try this:
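result = (
    df.set_index(['ID', 'time', 'date'])['value']
      .unstack([0, 1])                    # rows: date; columns: (ID, time)
      .rolling(len(df), min_periods=1)    # window spans every row, so it acts as an expanding window
      .mean().shift(1).bfill()            # mean of earlier days; the first day falls back to its own values
      .unstack().rename('avg')            # back to a long Series indexed by (ID, time, date)
      .to_frame()
      .join(df.set_index(['ID', 'time', 'date']))
      .reset_index().sort_values(['ID', 'date', 'time'])
)
print(result)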
Output:
   ID   time       date         avg  value
0   A  00:00 2017-01-01   23.000000     23
4   A  06:00 2017-01-01  100.000000    100
8   A  12:00 2017-01-01  330.000000    330
1   A  00:00 2017-01-02   23.000000     57
5   A  06:00 2017-01-02  100.000000    122
9   A  12:00 2017-01-02  330.000000    477
2   A  00:00 2017-01-03   40.000000     46
6   A  06:00 2017-01-03  111.000000     99
10  A  12:00 2017-01-03  403.500000    469
3   A  00:00 2017-01-04   42.000000     37
7   A  06:00 2017-01-04  107.000000    118
11  A  12:00 2017-01-04  425.333333    499
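As a possible alternative (a sketch, not the method used in the answer above), the same numbers can be obtained with `groupby` and `expanding`, which keeps the original column order and avoids the reshape and join. The helper `prior_mean` and the variable `out` are hypothetical names, and the sketch assumes each (ID, time) pair has one row per date.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['A'] * 12,
    'date': pd.to_datetime(['2017-01-01'] * 3 + ['2017-01-02'] * 3
                           + ['2017-01-03'] * 3 + ['2017-01-04'] * 3),
    'time': ['00:00', '06:00', '12:00'] * 4,
    'value': [23, 100, 330, 57, 122, 477, 46, 99, 469, 37, 118, 499],
})

def prior_mean(s):
    # Expanding mean over the rows seen so far, shifted by one row so each
    # date only sees earlier dates; bfill gives the first date its own value.
    return s.expanding().mean().shift(1).bfill()

# Sort first so that rows inside each (ID, time) group are in date order.
out = df.sort_values(['ID', 'date', 'time']).copy()
out['avg'] = out.groupby(['ID', 'time'])['value'].transform(prior_mean)
print(out)
```

One caveat with this sketch: a group that contains only a single row has nothing to back-fill from, so its avg would stay NaN.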