Take the difference between two columns of a pandas dataframe based on a condition in Python
I have a dataframe named pricecomp_df. I want to compare the "market price" column with each of the other price columns ("apple price", "mangoes price", "watermelon price"), but take the difference according to a priority: watermelon price first, mangoes price second, apple price third. The input dataframe is given below:
   code  apple price  mangoes price  watermelon price  market price
0   101          101            NaN               NaN           122
1   102          123            123               NaN           124
2   103          NaN            NaN               NaN           123
3   105          123            167               NaN           154
4   107          165            NaN               177           176
5   110          123            NaN               NaN           123
In the first row only the apple price and the market price are present, so take their difference. In the second row both the apple and mangoes prices are present, so only the difference between the market price and the mangoes price should be taken. The remaining rows follow the same priority rule. Rows where all three prices are NaN should be skipped. Can anyone help with this?
Hope I'm not too late. The idea is to calculate the differences and overwrite them according to your priority list.
import numpy as np
import pandas as pd
df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                   'apple price': [101, 123, np.nan, 123, 165, 123],
                   'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                   'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                   'market price': [122, 124, 123, 154, 176, 123]})
# Calculate difference to apple price
df['diff'] = df['market price'] - df['apple price']
# Overwrite with difference to mangoes price
df['diff'] = df.apply(lambda x: x['market price'] - x['mangoes price'] if not np.isnan(x['mangoes price']) else x['diff'], axis=1)
# Overwrite with difference to watermelon price
df['diff'] = df.apply(lambda x: x['market price'] - x['watermelon price'] if not np.isnan(x['watermelon price']) else x['diff'], axis=1)
print(df)
   apple price  code  mangoes price  market price  watermelon price  diff
0          101   101            NaN           122               NaN    21
1          123   102            123           124               NaN     1
2          NaN   103            NaN           123               NaN   NaN
3          123   105            167           154               NaN   -13
4          165   107            NaN           176               177    -1
5          123   110            NaN           123               NaN     0
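The same priority rule can probably also be expressed without row-wise apply calls. The following is only a sketch under assumptions, using the sample data from above: the names priority, chosen and result are introduced here for illustration and are not part of the original answer. It back-fills across the price columns in priority order to pick the first available price, and it also drops the rows where all three prices are NaN, which the question asked for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'code': [101, 102, 103, 105, 107, 110],
    'apple price': [101, 123, np.nan, 123, 165, 123],
    'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
    'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
    'market price': [122, 124, 123, 154, 176, 123],
})

# Price columns ordered from highest to lowest priority (illustrative name).
priority = ['watermelon price', 'mangoes price', 'apple price']

# Back-fill along each row so the first column ends up holding the
# highest-priority price that is not NaN (NaN if all three are missing).
chosen = df[priority].bfill(axis=1).iloc[:, 0]

df['diff'] = df['market price'] - chosen

# Skip rows where all three prices are NaN, as requested in the question.
result = df.dropna(subset=priority, how='all')
print(result)
```

With the sample data, the row for code 103 is dropped from result because it has no price to compare against, while df itself keeps that row with a NaN diff.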