I have a dataframe which has the following columns:
Date | Zip | Price | |
---|---|---|---|
0 | 2019-01-01 | 90102 | 58.02 |
1 | 2019-01-01 | 90102 | 81.55 |
2 | 2019-01-01 | 90102 | 11.97 |
3 | 2019-01-01 | 90102 | 93.23 |
4 | 2019-01-01 | 90103 | 13.68 |
I want to create a 4th column which should have ratio of price based on the max price in that zip and on that date.
So I have used another df called df_max_price
df_max_price = df.groupby(['Date','Zip'], as_index=False)['Price'].max()
Date | Zip | Price | |
---|---|---|---|
0 | 2019-01-01 | 90102 | 93.23 |
1 | 2019-01-01 | 90103 | 13.68 |
Now I want to have a new column in the df which shall be the ratio of Price and maxprice for that date and zip code
Date | Zip | Price | Ratio | |
---|---|---|---|---|
0 | 2019-01-01 | 90102 | 58.02 | 0.622 |
1 | 2019-01-01 | 90102 | 81.55 | 0.875 |
2 | 2019-01-01 | 90102 | 11.97 | 0.128 |
3 | 2019-01-01 | 90102 | 93.23 | 1.000 |
4 | 2019-01-01 | 90103 | 13.68 | 1.000 |
The calculation is based on 58.02/93.23
and so on.
Can someone help me showing how it can be done using lambda function.
Use groupby
and transform
df['ratio'] = df['Price'] / df.groupby(['Date','Zip'], as_index=False)['Price'].transform('max')['Price']
Date Zip Price ratio
0 2019-01-01 90102 58.02 0.622332
1 2019-01-01 90102 81.55 0.874718
2 2019-01-01 90102 11.97 0.128392
3 2019-01-01 90102 93.23 1.000000
4 2019-01-01 90103 13.68 1.000000
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.