Pandas cumprod with reset indicated by second column
I need to calculate cumulative products that reset with some frequency, indicated by a new value in a column Wgt.
For example, in the DataFrame produced by:
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.lognormal(0, 0.01, 27),
                  pd.date_range('2019-01-06', '2019-02-01'), columns=['Chg'])
df['Wgt'] = df['Chg'].asfreq('W')   # one non-NaN marker row per week
df.loc[df.Wgt > 0, 'Wgt'] = np.random.uniform(0.5, 1, df.Wgt.count())
Chg Wgt
2019-01-06 1.014571 0.861546
2019-01-07 1.018993 NaN
2019-01-08 1.017461 NaN
2019-01-09 1.003788 NaN
2019-01-10 1.014106 NaN
2019-01-11 0.995758 NaN
2019-01-12 0.989058 NaN
2019-01-13 0.995897 0.602225
2019-01-14 1.007336 NaN
2019-01-15 1.004143 NaN
...
I want to compute a new column Agg whose value is:

if df.Wgt.notna(), then df.Agg = df.Wgt
otherwise df.Agg = df.Agg.shift() * df.Chg
I.e., in this example Agg would be:
Chg Wgt Agg
1/6/2019 1.014571 0.861546 0.861546
1/7/2019 1.018993 NaN 0.877909343
1/8/2019 1.017461 NaN 0.893238518
1/9/2019 1.003788 NaN 0.896622106
1/10/2019 1.014106 NaN 0.909269857
1/11/2019 0.995758 NaN 0.905412734
1/12/2019 0.989058 NaN 0.895505708
1/13/2019 0.995897 0.602225 0.602225
1/14/2019 1.007336 NaN 0.606642923
1/15/2019 1.004143 NaN 0.609156244
...
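The rule above can be written as an explicit row-by-row loop. This is slow for large frames, but it states the recurrence directly; `agg_loop` is a hypothetical helper name, a minimal sketch for reference:

```python
import numpy as np
import pandas as pd

def agg_loop(df):
    """Explicit (slow) implementation of the Agg recurrence:
    take Wgt where present, otherwise previous Agg times Chg."""
    out = []
    prev = np.nan
    for chg, wgt in zip(df['Chg'], df['Wgt']):
        prev = wgt if pd.notna(wgt) else prev * chg
        out.append(prev)
    return pd.Series(out, index=df.index)
```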
What are pandalicious ways of doing this?
Using np.where with cumprod:
s = df.loc[df.Wgt.isnull(), 'Chg'].groupby(df.Wgt.notna().cumsum()).cumprod()
np.where(df.Wgt.notna(), df.Wgt, s * df.Wgt.ffill())
Out[531]:
array([0.861546 , 0.87790934, 0.89323852, 0.89662211, 0.90926986,
0.90541273, 0.89550571, 0.602225 , 0.60664292, 0.60915624])