
Merge pandas dataframe on a combination of conditions with AND and OR not Equal

Given the two dataframes below:

df1:
| Company  | Package | Badge Number | Work Date  |
|----------|---------|--------------|------------|
| Compnay1 | X       | 1            | 2020-01-01 |
| Company2 | X       | 2            | 2020-01-01 |

df2:
| Company  | Package | Badge Number | Work Date  |
|----------|---------|--------------|------------|
| Compnay1 | X       | 1            | 2020-01-01 |
| Compnay1 | Y       | 1            | 2020-01-01 |
| Company2 | X       | 1            | 2020-01-01 |
| Company2 | Y       | 1            | 2020-01-01 |
| Company2 | X       | 2            | 2020-01-01 |

What's needed: I need to write Python code that is equivalent to this SQL statement.

SELECT * 
FROM df1
INNER JOIN df2
ON df1.[Badge Number] = df2.[Badge Number]
AND df1.[Work Date] = df2.[Work Date]
AND (df1.[Company] != df2.[Company] OR df1.[Package] != df2.[Package])

Result:

| df1.Company | df1.Package | df1.Badge Number | df1.Work Date | df2.Company | df2.Package | df2.Badge Number | df2.Work Date |
|-------------|-------------|------------------|---------------|-------------|-------------|------------------|---------------|
| Compnay1    | X           | 1                | 2020-01-01    | Compnay1    | Y           | 1                | 2020-01-01    |
| Compnay1    | X           | 1                | 2020-01-01    | Company2    | X           | 1                | 2020-01-01    |
| Compnay1    | X           | 1                | 2020-01-01    | Company2    | Y           | 1                | 2020-01-01    |

Can this be done purely in pandas, without needing to write SQL queries in the Python code?

One idea is to use DataFrame.merge on the two columns compared with equality:

df = df1.merge(df2, on=['Badge Number','Work Date'])

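And then filter on the inequality conditions, using the default _x / _y suffixes that merge adds to the overlapping non-key columns: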

df[(df['Company_x'] != df['Company_y']) | (df['Package_x'] != df['Package_y'])]
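
For reference, here is a minimal end-to-end sketch of the merge-then-filter approach above, run on the sample data from the question. The variable names (`merged`, `mask`, `result`) and the `_df1` / `_df2` suffixes are illustrative choices and not part of the original answer, which relies on the default `_x` / `_y` suffixes. The "Compnay1" spelling is kept exactly as posted so the data matches the tables.

```python
import pandas as pd

# Sample data copied from the question ("Compnay1" is kept as posted).
cols = ["Company", "Package", "Badge Number", "Work Date"]
df1 = pd.DataFrame(
    [
        ["Compnay1", "X", 1, "2020-01-01"],
        ["Company2", "X", 2, "2020-01-01"],
    ],
    columns=cols,
)
df2 = pd.DataFrame(
    [
        ["Compnay1", "X", 1, "2020-01-01"],
        ["Compnay1", "Y", 1, "2020-01-01"],
        ["Company2", "X", 1, "2020-01-01"],
        ["Company2", "Y", 1, "2020-01-01"],
        ["Company2", "X", 2, "2020-01-01"],
    ],
    columns=cols,
)

# Equality part of the ON clause: inner join on the two key columns.
# suffixes=("_df1", "_df2") is an illustrative choice; the default is ("_x", "_y").
merged = df1.merge(
    df2,
    on=["Badge Number", "Work Date"],
    how="inner",
    suffixes=("_df1", "_df2"),
)

# Inequality part of the ON clause: Company differs OR Package differs.
mask = (merged["Company_df1"] != merged["Company_df2"]) | (
    merged["Package_df1"] != merged["Package_df2"]
)
result = merged[mask].reset_index(drop=True)

print(result)
```

Two differences from the SQL version are worth keeping in mind. Because the join uses `on=`, the key columns `Badge Number` and `Work Date` appear once in the output instead of once per table. Also, the intermediate `merged` frame holds every key match before filtering, so on large inputs with many repeated keys it can be much bigger than the final result.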
