I'm trying to compare values between two columns in the same pandas DataFrame, and wherever a match is found I want to return the value from the matching row, but taken from a third column.
Basically, if the following is the DataFrame df:
| date | date_new | category | value |
| --------- | ---------- | -------- | ------ |
|2016-05-11 | 2018-05-15 | day | 1000.0 |
|2020-03-28 | 2018-05-11 | night | 2220.1 |
|2018-05-15 | 2020-03-28 | day | 142.8 |
|2018-05-11 | 2019-01-29 | night | 1832.9 |
I want to add a new column, say value_new, whose values are taken from value. Each date in date_new should be looked up in date, and a match only counts if both rows also have the same category.
Steps of the transformation:
1. For each value in date_new, look for a match in date.
2. If a match is found, check whether the values in the category column also match.
3. If both conditions hold, pick the value from the value column of the matched row; otherwise leave it blank.
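The steps above can be sketched quite literally as a dictionary lookup. This is only a minimal sketch: it assumes the dates are stored as strings (any hashable type would do) and that each (date, category) pair occurs at most once. The name `lookup` is illustrative, and the data is copied from the first table.

```python
import pandas as pd

# Data copied from the first table in the question.
df = pd.DataFrame({
    "date": ["2016-05-11", "2020-03-28", "2018-05-15", "2018-05-11"],
    "date_new": ["2018-05-15", "2018-05-11", "2020-03-28", "2019-01-29"],
    "category": ["day", "night", "day", "night"],
    "value": [1000.0, 2220.1, 142.8, 1832.9],
})

# Map every (date, category) pair to its value.
# Assumption: pairs are unique; with duplicates the last one wins.
lookup = {
    (d, c): v
    for d, c, v in zip(df["date"], df["category"], df["value"])
}

# Steps 1-3: look up (date_new, category) for each row; NaN when no row matches.
df["value_new"] = [
    lookup.get((d, c), float("nan"))
    for d, c in zip(df["date_new"], df["category"])
]

print(df)
```

With this data only the first two rows find a match; the other two stay empty (NaN).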
So the final DataFrame should look something like this:
| date | date_new | category | value | value_new |
| --------- | ---------- | -------- | ------ | --------- |
|2016-05-11 | 2018-05-15 | day | 1000.0 | 142.8 |
|2020-03-28 | 2018-05-11 | night | 2220.1 | 1832.9 |
|2018-05-15 | 2020-03-28 | day | 142.8 | None |
|2018-05-11 | 2016-05-11 | day | 1832.9 | 1000.0 |
Use DataFrame.merge with a left join and assign the result to a new column:
df['value_new'] = df.merge(df,
                           left_on=['date_new', 'category'],
                           right_on=['date', 'category'],
                           how='left')['value_y']
print(df)
date date_new category value value_new
0 2016-05-11 2018-05-15 day 1000.0 142.8
1 2020-03-28 2018-05-11 night 2220.1 NaN
2 2018-05-15 2020-03-28 day 142.8 NaN
3 2018-05-11 2016-05-11 day 1832.9 1000.0
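For reference, here is a self-contained version that can be run as-is. It is a sketch under a few assumptions: the sample data is reconstructed from the printed output above, the intermediate names `right` and `merged` are illustrative, and `validate="many_to_one"` is added as a guard so the merge raises an error if a (date, category) pair appears more than once. Without that guard, duplicate pairs would add rows to the merge result and the assignment would no longer line up with df.

```python
import pandas as pd

# Sample data reconstructed from the printed output above.
df = pd.DataFrame({
    "date": ["2016-05-11", "2020-03-28", "2018-05-15", "2018-05-11"],
    "date_new": ["2018-05-15", "2018-05-11", "2020-03-28", "2016-05-11"],
    "category": ["day", "night", "day", "day"],
    "value": [1000.0, 2220.1, 142.8, 1832.9],
})

# Right-hand side of the self-join: only the keys and the value to fetch,
# renamed so the join key matches date_new and no _x/_y suffixes are needed.
right = df[["date", "category", "value"]].rename(
    columns={"date": "date_new", "value": "value_new"}
)

# Left join keeps every row of df in its original order;
# validate raises MergeError if the right-hand keys are not unique.
merged = df.merge(
    right,
    on=["date_new", "category"],
    how="left",
    validate="many_to_one",
)

# Assign by position so a non-default index on df cannot misalign the result.
df["value_new"] = merged["value_new"].to_numpy()

print(df)
```

Renaming the right-hand columns before merging is a design choice rather than a requirement; it just avoids having to pick `value_y` out of the suffixed result.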