
Comparing values of 2 columns from the same pandas dataframe and returning the value of a 3rd column based on the comparison

I'm trying to compare values between 2 columns in the same pandas dataframe, and wherever a match is found I want to return the value from the matching row, but taken from a 3rd column.

Basically, if the following is the dataframe df:

| date      | date_new   | category | value  |
| --------- | ---------- | -------- | ------ |
|2016-05-11 | 2018-05-15 | day      | 1000.0 |
|2020-03-28 | 2018-05-11 | night    | 2220.1 |
|2018-05-15 | 2020-03-28 | day      | 142.8  |
|2018-05-11 | 2019-01-29 | night    | 1832.9 |

I want to add a new column, say value_new, obtained as follows: for every date in date_new, look for the same date in the date column, then check whether the two rows also have the same category value, and if so take value from the matching row.

[steps of transformation]
- 1. For each value in date_new, look for a match in date.
- 2. If a match is found, check whether the values in the category column also match.
- 3. If both of the above matches hold, pick the corresponding value from the value column of the row where both hold; otherwise leave blank.
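The steps above can be sketched as a plain dictionary lookup keyed on the (date, category) pair. This is only an illustrative sketch, not necessarily how it should be done at scale: it uses the input table from the question, the name value_by_key is made up for the example, and it assumes each (date, category) pair occurs at most once (with duplicates, the last row would win).

```python
import pandas as pd

# Input table from the question
df = pd.DataFrame({
    "date": ["2016-05-11", "2020-03-28", "2018-05-15", "2018-05-11"],
    "date_new": ["2018-05-15", "2018-05-11", "2020-03-28", "2019-01-29"],
    "category": ["day", "night", "day", "night"],
    "value": [1000.0, 2220.1, 142.8, 1832.9],
})

# Steps 1 and 2: index every row's value by its (date, category) pair
# (hypothetical helper name; assumes the pairs are unique)
value_by_key = dict(zip(zip(df["date"], df["category"]), df["value"]))

# Step 3: look up each row's (date_new, category) pair;
# pairs with no match give a missing value
df["value_new"] = [
    value_by_key.get(key) for key in zip(df["date_new"], df["category"])
]
print(df)
```

With this input, the first two rows find a match and the last two do not.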

So, I would want the final dataframe to look something like this:

| date      | date_new   | category | value  | value_new |
| --------- | ---------- | -------- | ------ | --------- |
|2016-05-11 | 2018-05-15 | day      | 1000.0 | 142.8     |
|2020-03-28 | 2018-05-11 | night    | 2220.1 | 1832.9    |
|2018-05-15 | 2020-03-28 | day      | 142.8  | None      |
|2018-05-11 | 2016-05-11 | day      | 1832.9 | 1000.0    |

Use DataFrame.merge with a left join and assign the result to a new column:

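# Self-join: match each row's (date_new, category) against every row's (date, category).
# The assignment aligns on the index, so this assumes df has the default RangeIndex
# and that (date, category) pairs are unique.
df['value_new'] = df.merge(df,
                           left_on=['date_new','category'],
                           right_on=['date','category'], how='left')['value_y']
print(df)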

         date    date_new category   value  value_new
0  2016-05-11  2018-05-15      day  1000.0      142.8
1  2020-03-28  2018-05-11    night  2220.1        NaN
2  2018-05-15  2020-03-28      day   142.8        NaN
3  2018-05-11  2016-05-11      day  1832.9     1000.0
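For anyone who wants to run this end to end, here is a self-contained sketch of the same self-merge. The frame is rebuilt from the printed output above; the name lookup is introduced only for this example. Renaming the right-hand columns up front avoids the _x/_y suffixes, and validate="many_to_one" is an optional guard that makes pandas raise a MergeError if a (date, category) pair is duplicated, since duplicates would otherwise multiply rows in the result.

```python
import pandas as pd

# Sample frame rebuilt from the printed output above
df = pd.DataFrame({
    "date": ["2016-05-11", "2020-03-28", "2018-05-15", "2018-05-11"],
    "date_new": ["2018-05-15", "2018-05-11", "2020-03-28", "2016-05-11"],
    "category": ["day", "night", "day", "day"],
    "value": [1000.0, 2220.1, 142.8, 1832.9],
})

# Right-hand lookup table: each row's value keyed by its (date, category),
# with date renamed so both sides share the same key names
lookup = df[["date", "category", "value"]].rename(
    columns={"date": "date_new", "value": "value_new"}
)

# Left join keeps every row of df; unmatched rows get NaN in value_new.
# validate="many_to_one" checks that the keys are unique in lookup.
result = df.merge(
    lookup,
    on=["date_new", "category"],
    how="left",
    validate="many_to_one",
)
print(result)
```

Here result carries the same value_new column as the output shown above, without relying on index alignment during assignment.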
