How to groupby and resample data in pandas?
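I have sales data for different customers on different dates. The dates are not continuous, and I want to resample the data to a daily frequency. How can I do that?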
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': list('aababcbc'),
                   'date': pd.date_range('2022-01-01', periods=8),
                   'value': range(8)}).sort_values('id')
df
  id       date  value
0  a 2022-01-01      0
1  a 2022-01-02      1
3  a 2022-01-04      3
2  b 2022-01-03      2
4  b 2022-01-05      4
6  b 2022-01-07      6
5  c 2022-01-06      5
7  c 2022-01-08      7
The desired output is as follows:
id  date        value
a   2022-01-01  0
a   2022-01-02  1
a   2022-01-03  0   ** there is no data for a on this day
a   2022-01-04  3
b   2022-01-03  2
b   2022-01-04  0   ** there is no data for b on this day
b   2022-01-05  4
b   2022-01-06  0   ** there is no data for b on this day
b   2022-01-07  6
c   2022-01-06  5
c   2022-01-07  0   ** there is no data for c on this day
c   2022-01-08  7
df.groupby(['id']).resample('D',on='date')['value'].sum().reset_index()
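# resample() needs datetime values, so make sure "date" is parsed
df["date"] = pd.to_datetime(df["date"])
# select "value" so only that column is summed per id and day
df.set_index("date").groupby("id")["value"].resample("1d").sum()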
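For reference, here is a minimal end-to-end sketch of the DatetimeIndex variant, assuming the example frame from the question. Selecting `value` before resampling and calling `reset_index()` at the end are additions made here to get back a flat `id, date, value` layout; the name `daily` is just illustrative.

```python
import pandas as pd

# Example data from the question (sorting is not needed for this sketch)
df = pd.DataFrame({'id': list('aababcbc'),
                   'date': pd.date_range('2022-01-01', periods=8),
                   'value': range(8)})

daily = (df.set_index('date')      # resample() works on a DatetimeIndex
           .groupby('id')['value'] # one daily series per customer
           .resample('D')
           .sum()                  # days without rows sum to 0
           .reset_index())         # MultiIndex (id, date) -> columns

print(daily)
```

Each `id` is resampled only between its own first and last date, so the result has 12 rows rather than 8 days for each of the 3 ids.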
def f(df):
    return df.resample('D', on='date')['value'].sum()

df.groupby(['id']).apply(f).reset_index()
This produces:
   id       date  value
0   a 2022-01-01      0
1   a 2022-01-02      1
2   a 2022-01-03      0
3   a 2022-01-04      3
4   b 2022-01-03      2
5   b 2022-01-04      0
6   b 2022-01-05      4
7   b 2022-01-06      0
8   b 2022-01-07      6
9   c 2022-01-06      5
10  c 2022-01-07      0
11  c 2022-01-08      7
Here is the solution I came up with:
df.groupby(['id']).apply(lambda x: x.resample('D',on='date')['value'].sum()).reset_index()
   id       date  value
0   a 2022-01-01      0
1   a 2022-01-02      1
2   a 2022-01-03      0
3   a 2022-01-04      3
4   b 2022-01-03      2
5   b 2022-01-04      0
6   b 2022-01-05      4
7   b 2022-01-06      0
8   b 2022-01-07      6
9   c 2022-01-06      5
10  c 2022-01-07      0
11  c 2022-01-08      7
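An alternative that is not from the answers above: if each customer has at most one row per day, the gaps can be filled without aggregating at all by using `asfreq` with `fill_value=0`. The sketch below is only an illustration under that assumption; the names `pieces` and `filled` are made up here.

```python
import pandas as pd

# Example data from the question
df = pd.DataFrame({'id': list('aababcbc'),
                   'date': pd.date_range('2022-01-01', periods=8),
                   'value': range(8)})

# Assumption: no duplicate dates within an id (asfreq reindexes, it does not aggregate)
pieces = {
    key: group.set_index('date')['value'].asfreq('D', fill_value=0)
    for key, group in df.groupby('id')
}

# Stack the per-customer series; the dict keys become the 'id' index level
filled = pd.concat(pieces, names=['id']).reset_index()

print(filled)
```

The difference from `resample('D').sum()` is that `asfreq` only inserts the missing days. If an id could have several rows on the same date, the resample-and-sum approach is the safer choice because it adds those rows together.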