I would like to find a faster way to calculate the "sales 52 weeks ago" column for each product below without using iterrows or itertuples. Any suggestions? The input is the table without the "sales 52 weeks ago" column, and the output is the entire table below.
         date  sales city product  sales 52 weeks ago
0  2020-01-01    1.5   c1      p1                 0.6
1  2020-01-01    1.2   c1      p2                 0.3
2  2019-05-02    0.5   c1      p1                 nan
3  2019-01-02    0.3   c1      p2                 nan
4  2019-01-02    0.6   c1      p1                 nan
5  2019-01-01    1.2   c1      p2                 nan
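Example itertuples code, which is really slow:
    from datetime import timedelta

    for row in df.itertuples(index=True, name='Pandas'):
        # Sales for the same product dated exactly 52 weeks earlier
        match = df.loc[(df['date'] == row.date - timedelta(weeks=52)) & (df['product'] == row.product), 'sales']
        # Leave the cell empty (NaN) when no earlier row exists
        if not match.empty:
            df.at[row.Index, 'sales 52 weeks ago'] = match.iloc[0]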
You need a merge after subtracting a 52-week Timedelta from the date:
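    import pandas as pd

    # Lookup keys: each row's date shifted back 52 weeks, paired with its product
    m = df['date'].sub(pd.Timedelta('52W')).to_frame().assign(product=df['product'])
    # The left merge keeps row order, so 'sales' lines up with df's default 0..n-1 index
    final = df.assign(sales_52_W_ago=m.merge(df, on=['date', 'product'], how='left').loc[:, 'sales'])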
         date  sales city product  sales_52_W_ago
0  2020-01-01    1.5   c1      p1             0.6
1  2020-01-01    1.2   c1      p2             0.3
2  2019-05-02    0.5   c1      p1             NaN
3  2019-01-02    0.3   c1      p2             NaN
4  2019-01-02    0.6   c1      p1             NaN
5  2019-01-01    1.2   c1      p2             NaN
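For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea, written slightly differently. Instead of shifting the dates of the rows being filled, it builds a lookup table whose dates are moved forward 52 weeks and left-merges it onto df, so the result does not rely on df having a default 0..n-1 index. The sample frame is rebuilt from the question's table; the name lookup is only illustrative, and the sketch assumes the date column is already datetime and that each (date, product) pair occurs at most once.

```python
import pandas as pd

# Sample data copied from the question's table
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2020-01-01", "2020-01-01", "2019-05-02",
        "2019-01-02", "2019-01-02", "2019-01-01",
    ]),
    "sales": [1.5, 1.2, 0.5, 0.3, 0.6, 1.2],
    "city": ["c1"] * 6,
    "product": ["p1", "p2", "p1", "p2", "p1", "p2"],
})

# Lookup table (illustrative name): every row's date moved forward 52 weeks,
# so it carries the date of the row that should receive its sales value
lookup = (
    df[["date", "product", "sales"]]
    .assign(date=lambda d: d["date"] + pd.Timedelta(weeks=52))
    .rename(columns={"sales": "sales_52_W_ago"})
)

# Left merge keeps every row of df; rows without a match get NaN
final = df.merge(lookup, on=["date", "product"], how="left")
print(final)
```

If the same (date, product) pair can appear more than once, the merge would duplicate rows, so the lookup table would need to be aggregated or de-duplicated first.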