Calculate mean YoY percentage change - Pandas DataFrame
I have a Pandas DataFrame with monthly observations. I'd like to calculate a couple of metrics: MoM and YoY percentage change.
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'c': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'z': [1, 2, 3, 4, 5, 6, 7, 8],
    '2018-01': [10, 12, 14, 16, 18, 20, 22, 24],
    '2018-02': [12, 14, 16, 18, 20, 22, 24, 26],
    '2019-01': [8, 10, 12, 14, 16, 18, 20, 22],
    '2019-02': [10, 12, 14, 16, 18, 20, 22, 24],
})
For each z in c, I'd like to calculate the MoM and YoY change in percentage. This would be the pct difference between observations in the month columns, and the aggregate percent change between years.
I am looking for a solution that generalizes across several monthly columns and years.
Expected output:
c  z  2018-01  2018-02  2019-01  2019-02  Avg_YoY_pct
A  1  10                                  -18.18
A  2  12
A  3  14
B  4  .............................
B  5
B  6
C  7
C  8
Avg_YoY_pct is calculated as the percentage difference between the sums of all monthly values of each year.
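To make the definition concrete, here is a small worked check for the first row (c = A, z = 1), using only the sample values from the DataFrame above. The variable names are illustrative, not part of the original post.

```python
# Worked check for the first row (c = 'A', z = 1) of the sample data
total_2018 = 10 + 12   # 2018-01 + 2018-02
total_2019 = 8 + 10    # 2019-01 + 2019-02

# Percentage difference between the two yearly totals
avg_yoy_pct = (total_2019 - total_2018) / total_2018 * 100
print(round(avg_yoy_pct, 2))  # -18.18
```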
Thanks for providing example input so nicely. Here's an approach that first melts the table into long form, then performs a groupby to get the average YoY for each month, and then another groupby to get the average YoY over all months. I think it extends flexibly to more month and year columns.
# melt the wide table into a long table
long_df = df.melt(
    id_vars=['c', 'z'],
    var_name='date',
    value_name='val',
)

# extract the year and month from the date column
long_df[['year', 'month']] = long_df['date'].str.split('-', expand=True)
long_df['year'] = long_df['year'].astype(int)
long_df['month'] = long_df['month'].astype(int)

# group by c/z/month and shift to get avg yoy for each month
avg_month_yoy = long_df.groupby(['c', 'z', 'month'])['val'].apply(
    lambda v: v.sub(v.shift(1)).div(v.shift(1)).multiply(100).mean()
).reset_index()

# group by just c/z to get avg yoy over all months
avg_yoy = avg_month_yoy.groupby(['c', 'z'])['val'].mean()

# add the avg_yoy back into the original table
df = df.set_index(['c', 'z'])
df['Avg_YoY_pct'] = avg_yoy
df = df.reset_index()

print(df)
Output
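For the sample data, the code above gives about -18.33 for the first row (c = A, z = 1), because it averages the per-month YoY changes (-20% for January and about -16.67% for February). The question's expected -18.18 comes from comparing yearly totals instead. If that definition is the one wanted, a minimal sketch could look like the following. It assumes every column other than c and z is named YYYY-MM, and the names yearly, yoy and result are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'c': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C'],
    'z': [1, 2, 3, 4, 5, 6, 7, 8],
    '2018-01': [10, 12, 14, 16, 18, 20, 22, 24],
    '2018-02': [12, 14, 16, 18, 20, 22, 24, 26],
    '2019-01': [8, 10, 12, 14, 16, 18, 20, 22],
    '2019-02': [10, 12, 14, 16, 18, 20, 22, 24],
})

# Melt to long form and pull the year out of the 'YYYY-MM' column name
long_df = df.melt(id_vars=['c', 'z'], var_name='date', value_name='val')
long_df['year'] = long_df['date'].str.split('-').str[0].astype(int)

# Sum all monthly values per c/z/year, then pivot so each year is a column
yearly = long_df.groupby(['c', 'z', 'year'])['val'].sum().unstack('year')

# Percent change of each yearly total relative to the previous year
yoy = yearly.div(yearly.shift(axis=1)).sub(1).mul(100)

# Average over all consecutive year pairs (the first year has no predecessor
# and is NaN, which mean() skips)
avg_yoy = yoy.mean(axis=1)

# Attach the result to the original wide table, aligned on c/z
result = df.set_index(['c', 'z'])
result['Avg_YoY_pct'] = avg_yoy
result = result.reset_index()

print(result.round(2))  # first row: Avg_YoY_pct is about -18.18
```

With only two years there is a single year pair, so the average is just that one change. With more years, this averages the changes between each pair of consecutive yearly totals, which may or may not be the intended aggregation.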