Creating a new pandas dataframe column by looking up values in other rows

Question

I would like to find a faster way to calculate the "sales 52 weeks ago" column for each product below without using iterrows or itertuples. Any suggestions? The input is the table without the "sales 52 weeks ago" column, and the output is the entire table below.

         date  sales city product  sales 52 weeks ago
0  2020-01-01    1.5   c1      p1                 0.6
1  2020-01-01    1.2   c1      p2                 0.3
2  2019-05-02    0.5   c1      p1                 NaN
3  2019-01-02    0.3   c1      p2                 NaN
4  2019-01-02    0.6   c1      p1                 NaN
5  2019-01-01    1.2   c1      p2                 NaN

Example code using itertuples, which is really slow:

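from datetime import timedelta

for row in df.itertuples(index=True, name='Pandas'):
    # Sales rows for the same product dated exactly 52 weeks earlier
    match = df.loc[(df['date'] == row.date - timedelta(weeks=52))
                   & (df['product'] == row.product), 'sales']
    # Leave the value missing when there is no matching row
    if not match.empty:
        df.at[row.Index, 'sales 52 weeks ago'] = match.iloc[0]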

Answer 1

You need a merge after subtracting a Timedelta from the date. Note that 52 weeks is 364 days, so 2020-01-01 is matched against 2019-01-02:

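import pandas as pd
# Shift every date back 52 weeks and keep product as the second merge key
m = df['date'].sub(pd.Timedelta('52W')).to_frame().assign(product=df['product'])
# Left-merge to fetch the sales recorded on the shifted date for each product
final = df.assign(sales_52_W_ago=m.merge(df,
         on=['date', 'product'], how='left').loc[:, 'sales'])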

        date  sales city product  sales_52_W_ago
0 2020-01-01    1.5   c1      p1             0.6
1 2020-01-01    1.2   c1      p2             0.3
2 2019-05-02    0.5   c1      p1             NaN
3 2019-01-02    0.3   c1      p2             NaN
4 2019-01-02    0.6   c1      p1             NaN
5 2019-01-01    1.2   c1      p2             NaN
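
For anyone who wants to run this end to end, below is a minimal self-contained sketch of the same merge idea, written the other way round: instead of shifting the dates you look up, it moves the dates of a lookup copy forward by 52 weeks and left-merges that onto the original frame. The names lookup and result are only illustrative, and validate="m:1" is an optional safeguard I added (it is not part of the answer above); it makes pandas raise if a (date, product) pair appears more than once in the lookup. It assumes the date column is already datetime64 rather than strings.

```python
import pandas as pd

# Sample data copied from the question; 'date' must be datetime64, not strings.
df = pd.DataFrame({
    "date": pd.to_datetime(["2020-01-01", "2020-01-01", "2019-05-02",
                            "2019-01-02", "2019-01-02", "2019-01-01"]),
    "sales": [1.5, 1.2, 0.5, 0.3, 0.6, 1.2],
    "city": ["c1"] * 6,
    "product": ["p1", "p2", "p1", "p2", "p1", "p2"],
})

# Lookup table: each row's sales, re-dated to 52 weeks (364 days) later.
lookup = (
    df[["date", "product", "sales"]]
    .assign(date=lambda d: d["date"] + pd.Timedelta(weeks=52))
    .rename(columns={"sales": "sales_52_W_ago"})
)

# The left merge keeps every original row in its original order.
# validate="m:1" raises if the lookup has duplicate (date, product) keys.
result = df.merge(lookup, on=["date", "product"], how="left", validate="m:1")
print(result)
```

Because the merge returns a new frame, this version does not depend on df having a default 0..n-1 index, which the assign-based one-liner relies on when it aligns the merged sales back to df.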

1 answer

solution1
0 ACCEPTED 2020-01-03 05:10:08
