Multiple dataframe groupby Pandas
I have 2 datasets to work with:
ID  Date        Amount
1   2020-01-02  1000
1   2020-01-09  200
1   2020-01-08  400
and another dataset that gives, for each ID, its most frequent day of the week and its most frequent week of the month (there are multiple such IDs):
ID  Pref_Day_Of_Week_A  Pref_Week_Of_Month_A
1   3                   2
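For ID 1, Thursday (day 3, counting from Monday = 0) is the most frequent day of the week, and the second week of the month is the most frequent week of the month.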
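As a quick check of the day numbering, here is a minimal sketch that assumes the values follow pandas' `dt.dayofweek` convention (Monday = 0, Sunday = 6). The variable names `dates`, `day_numbers` and `day_names` are only illustrative.

```python
import pandas as pd

# The three sample dates listed for ID 1
dates = pd.Series(pd.to_datetime(["2020-01-02", "2020-01-09", "2020-01-08"]))

# dt.dayofweek counts from Monday = 0, so Thursday is 3
day_numbers = dates.dt.dayofweek
day_names = dates.dt.day_name()

print(pd.DataFrame({"date": dates, "dayofweek": day_numbers, "day_name": day_names}))
```

Two of the three dates fall on a Thursday, which is why 3 is the preferred day for this ID.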
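For every ID, I want the sum of all amounts that fall on its most frequent day of the week, and the sum of all amounts that fall in its most frequent week of the month (hence the need for a groupby):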
ID  Amount_On_Pref_Day  Amount_Pref_Week
1   1200                600
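I would appreciate it if someone could help me compute this dataframe with Pandas. For reference, I use this function to find the week of the month for a given date: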
import pandas as pd

# https://stackoverflow.com/a/64192858/2901002
def weekinmonth(dates):
    """Get week number in a month.

    Parameters:
        dates (pd.Series): Series of dates.
    Returns:
        pd.Series: Week number in a month.
    """
    # Shift each date back to the first day of its month
    firstday_in_month = dates - pd.to_timedelta(dates.dt.day - 1, unit='d')
    # Offset by the weekday of the 1st so that weeks start on Monday
    return (dates.dt.day - 1 + firstday_in_month.dt.weekday) // 7 + 1
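To see what the function returns for the sample data, here is a small sketch that applies it to the three dates for ID 1. The function body is repeated so the snippet runs on its own, and the names `dates` and `weeks` are only illustrative.

```python
import pandas as pd

def weekinmonth(dates):
    """Week number within the month, with weeks starting on Monday."""
    firstday_in_month = dates - pd.to_timedelta(dates.dt.day - 1, unit="D")
    return (dates.dt.day - 1 + firstday_in_month.dt.weekday) // 7 + 1

dates = pd.Series(pd.to_datetime(["2020-01-02", "2020-01-09", "2020-01-08"]))
weeks = weekinmonth(dates)

# 2020-01-01 is a Wednesday, so Jan 2 is still in week 1,
# while Jan 8 and Jan 9 fall in week 2.
print(weeks.tolist())  # [1, 2, 2]
```

Week 2 is therefore the most frequent week for this ID, and its amounts (200 + 400) give the expected 600.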
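The idea is to keep only the rows that match the preferred dayofweek and week, aggregate them with sum, and finally join the two results together with concat: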
import pandas as pd

# https://stackoverflow.com/a/64192858/2901002
def weekinmonth(dates):
    """Get week number in a month.

    Parameters:
        dates (pd.Series): Series of dates.
    Returns:
        pd.Series: Week number in a month.
    """
    firstday_in_month = dates - pd.to_timedelta(dates.dt.day - 1, unit='d')
    return (dates.dt.day - 1 + firstday_in_month.dt.weekday) // 7 + 1

# Day of week (Monday = 0) and week of month for every row
df['Date'] = pd.to_datetime(df['Date'])
df['dayofweek'] = df['Date'].dt.dayofweek
df['week'] = weekinmonth(df['Date'])

# Most frequent value per group; on ties, mode() is sorted, so the smallest wins
f = lambda x: x.mode().iat[0]
df1 = (df.groupby('ID', as_index=False)
         .agg(Pref_Day_Of_Week_A=('dayofweek', f),
              Pref_Week_Of_Month_A=('week', f)))

# Renaming makes the merge join on ID plus the preferred value,
# so only rows on the preferred day (or in the preferred week) survive
s1 = df1.rename(columns={'Pref_Day_Of_Week_A': 'dayofweek'}).merge(df).groupby('ID')['Amount'].sum()
s2 = df1.rename(columns={'Pref_Week_Of_Month_A': 'week'}).merge(df).groupby('ID')['Amount'].sum()

df2 = pd.concat([s1, s2], axis=1, keys=('Amount_On_Pref_Day', 'Amount_Pref_Week'))
print(df2)
    Amount_On_Pref_Day  Amount_Pref_Week
ID
1                 1200               600
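A possible variation, offered as a sketch rather than as part of the answer above: if the preference table from the question already exists as a DataFrame, it can be merged in directly instead of being recomputed with mode. The names `prefs`, `merged`, `on_pref_day`, `in_pref_week` and `result` are hypothetical, and the sample frames are rebuilt here so the snippet is self-contained.

```python
import pandas as pd

def weekinmonth(dates):
    """Week number within the month, with weeks starting on Monday."""
    firstday_in_month = dates - pd.to_timedelta(dates.dt.day - 1, unit="D")
    return (dates.dt.day - 1 + firstday_in_month.dt.weekday) // 7 + 1

# Sample data from the question
df = pd.DataFrame({
    "ID": [1, 1, 1],
    "Date": pd.to_datetime(["2020-01-02", "2020-01-09", "2020-01-08"]),
    "Amount": [1000, 200, 400],
})
# Assumed: the second dataset, already loaded (hypothetical name)
prefs = pd.DataFrame({
    "ID": [1],
    "Pref_Day_Of_Week_A": [3],
    "Pref_Week_Of_Month_A": [2],
})

# Attach each ID's preferences to its transactions
merged = df.merge(prefs, on="ID", how="left")

# Boolean masks: does the row fall on the preferred day / in the preferred week?
on_pref_day = merged["Date"].dt.dayofweek.eq(merged["Pref_Day_Of_Week_A"])
in_pref_week = weekinmonth(merged["Date"]).eq(merged["Pref_Week_Of_Month_A"])

# Zero out non-matching amounts, then sum per ID
result = (
    merged.assign(
        Amount_On_Pref_Day=merged["Amount"].where(on_pref_day, 0),
        Amount_Pref_Week=merged["Amount"].where(in_pref_week, 0),
    )
    .groupby("ID")[["Amount_On_Pref_Day", "Amount_Pref_Week"]]
    .sum()
)
print(result)
```

Because non-matching rows are zeroed rather than filtered out, an ID with no transaction on its preferred day still appears in the result with a total of 0, which may or may not be what is wanted.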