
Take the difference between two columns of a pandas DataFrame based on a condition in Python

I have a dataframe named pricecomp_df. I want to compare the "market price" column with each of the other price columns ("apple price", "mangoes price", "watermelon price"), but take the difference according to a priority order: watermelon price first, mangoes price second, apple price third. The input dataframe is given below:

   code  apple price  mangoes price  watermelon price  market price
0   101          101            NaN               NaN           122
1   102          123            123               NaN           124
2   103          NaN            NaN               NaN           123
3   105          123            167               NaN           154
4   107          165            NaN               177           176
5   110          123            NaN               NaN           123

Here the first row has only an apple price and a market price, so take their difference. In the second row we have both apple and mangoes prices, so I have to take only the difference between the market price and the mangoes price. Likewise, take the difference for the other rows based on the priority condition. Also skip the rows where all three prices are NaN. Can anyone help with this?

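Hope I'm not too late. The idea is to calculate the differences and overwrite them according to your priority list: start with the lowest priority (apple price), then overwrite with the mangoes difference and finally the watermelon difference wherever those prices are present.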

import numpy as np
import pandas as pd

df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                   'apple price': [101, 123, np.nan, 123, 165, 123],
                   'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                   'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                   'market price': [122, 124, 123, 154, 176, 123]})

# Calculate difference to apple price
df['diff'] = df['market price'] - df['apple price']
# Overwrite with difference to mangoes price
df['diff'] = df.apply(lambda x: x['market price'] - x['mangoes price'] if not np.isnan(x['mangoes price']) else x['diff'], axis=1)
# Overwrite with difference to watermelon price
df['diff'] = df.apply(lambda x: x['market price'] - x['watermelon price'] if not np.isnan(x['watermelon price']) else x['diff'], axis=1)

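# The output below comes from an older Python/pandas setup;
# column order and number formatting may differ on newer versions.
print(df)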
   apple price  code  mangoes price  market price  watermelon price  diff
0          101   101            NaN           122               NaN    21
1          123   102            123           124               NaN     1
2          NaN   103            NaN           123               NaN   NaN
3          123   105            167           154               NaN   -13
4          165   107            NaN           176               177    -1
5          123   110            NaN           123               NaN     0
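
The overwrite steps above can also be written without a row-wise apply. The following is a minimal sketch, not part of the original answer: it assumes the same column names as the example, and the names chosen and result are introduced here purely for illustration. It picks the highest-priority price that is present by chaining fillna, then drops the rows where all three prices are missing, which the question also asked for.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'code': [101, 102, 103, 105, 107, 110],
    'apple price': [101, 123, np.nan, 123, 165, 123],
    'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
    'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
    'market price': [122, 124, 123, 154, 176, 123],
})

# Pick the highest-priority price that is present:
# watermelon first, then mangoes, then apple.
# 'chosen' is a hypothetical helper name, not from the original answer.
chosen = (df['watermelon price']
          .fillna(df['mangoes price'])
          .fillna(df['apple price']))

df['diff'] = df['market price'] - chosen

# Rows where all three prices are NaN end up with a NaN diff; drop them.
result = df.dropna(subset=['diff'])
print(result)
```

Because fillna only fills the gaps left by the higher-priority column, the order of the chain encodes the priority list directly; adding another fruit would mean adding one more fillna call at the right position.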


