My data records look like this:
category dt userid amt
1 4/14/2019 1 140
1 5/1/2019 1 500
2 5/5/2019 1 300
3 5/19/2019 1 230
2 6/17/2019 1 200
4 6/18/2019 1 400
1 7/30/2019 1 400
1 8/17/2019 1 300
2 12/2/2019 1 200
2 12/23/2019 1 500
1 1/10/2019 2 470
1 2/25/2019 2 450
2 10/4/2019 2 350
Q1: How can I get the average amount per month for each user, plus the average across all months?
user month1 month2 month3 month4 month5 month6 month7 month8 month9 month10 month11 month12 avg_all_month
1 0 0 0 140 343.33 300 400 300 0 0 0 350 305.55
2 470 450 0 0 0 0 0 0 0 350 0 0 423.33
Q2: How can I count the transactions per category for each user?
user pro_cat1 pro_cat2 pro_cat3 pro_cat4 total_product
1 4 3 1 1 7
2 2 1 0 0 3
If all the data falls within the same year, you can use DataFrame.pivot_table with DataFrame.reindex and DataFrame.add_prefix, then add each user's mean over all of their transactions as avg_all_month:
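import pandas as pd

# Parse the date strings so the month can be extracted.
df['dt'] = pd.to_datetime(df['dt'])
df2 = (df.pivot_table(index='userid',
                      columns=df['dt'].dt.month,
                      values='amt',
                      aggfunc='mean',
                      fill_value=0)
         # Make sure all 12 months appear as columns, even the empty ones.
         .reindex(range(1, 13), axis=1, fill_value=0)
         .add_prefix('month')
         # Mean over all of the user's transactions, aligned on userid.
         .assign(avg_all_month=lambda x: df.groupby('userid')['amt'].mean())
         .reset_index()
         .rename_axis(None, axis=1))
print(df2)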
userid month1 month2 month3 month4 month5 month6 month7 month8 \
0 1 0 0 0 140 343.333333 300 400 300
1 2 470 450 0 0 0.000000 0 0 0
month9 month10 month11 month12 avg_all_month
0 0 0 0 350 317.000000
1 0 350 0 0 423.333333
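Note that avg_all_month here is the mean over every transaction of a user (317 for user 1), while the expected output in the question (305.55) looks like the mean of the monthly averages, counting only months that have data. A minimal sketch of that variant, assuming that reading of the question is correct; the names monthly, avg_of_months and result are only illustrative:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "category": [1, 1, 2, 3, 2, 4, 1, 1, 2, 2, 1, 1, 2],
    "dt": ["4/14/2019", "5/1/2019", "5/5/2019", "5/19/2019", "6/17/2019",
           "6/18/2019", "7/30/2019", "8/17/2019", "12/2/2019", "12/23/2019",
           "1/10/2019", "2/25/2019", "10/4/2019"],
    "userid": [1] * 10 + [2] * 3,
    "amt": [140, 500, 300, 230, 200, 400, 400, 300, 200, 500, 470, 450, 350],
})
df["dt"] = pd.to_datetime(df["dt"])

# Mean amount per user and month; months without transactions stay NaN.
monthly = (df.groupby(["userid", df["dt"].dt.month])["amt"]
             .mean()
             .unstack())

# The row-wise mean skips NaN, so only months with data are averaged.
avg_of_months = monthly.mean(axis=1)

result = (monthly.reindex(columns=range(1, 13))  # add the missing months
                 .fillna(0)                      # show empty months as 0
                 .add_prefix("month")
                 .assign(avg_all_month=avg_of_months)
                 .reset_index()
                 .rename_axis(None, axis=1))
print(result.round(2))
```

The average is taken before the empty months are filled with 0; filling first would pull the value down, because the zeros would then count as real months.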
For the second question, use crosstab and add the row-wise sum as total_product:
df3 = (pd.crosstab(df['userid'],
                   df['category'])
         .add_prefix('pro_')
         .assign(total_product=lambda x: x.sum(axis=1))
         .reset_index()
         .rename_axis(None, axis=1))
print(df3)
userid pro_1 pro_2 pro_3 pro_4 total_product
0 1 4 4 1 1 10
1 2 2 1 0 0 3
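For the sample data, user 1 has four category-2 rows and ten transactions in total, so the 3 and 7 in the question's expected output do not match the data shown. The same counts can also be built without crosstab; the following is a rough sketch using groupby, size and unstack, with the pro_cat prefix taken from the question's expected column names (the variable name counts is only illustrative):

```python
import pandas as pd

# Only the two columns needed for counting, copied from the question.
df = pd.DataFrame({
    "category": [1, 1, 2, 3, 2, 4, 1, 1, 2, 2, 1, 1, 2],
    "userid": [1] * 10 + [2] * 3,
})

counts = (df.groupby(["userid", "category"])
            .size()                   # number of rows per (user, category)
            .unstack(fill_value=0)    # categories become columns
            .add_prefix("pro_cat")
            .assign(total_product=lambda x: x.sum(axis=1))
            .reset_index()
            .rename_axis(None, axis=1))
print(counts)
```

crosstab is shorter when only counts are needed; the groupby form is easier to adapt if a different aggregation is wanted later.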