Merge pandas DataFrames on a combination of AND and OR (not-equal) conditions
Given: the two DataFrames below.
df1:
| Company | Package | Badge Number | Work Date |
|----------|---------|--------------|------------|
| Compnay1 | X | 1 | 2020-01-01 |
| Company2 | X | 2 | 2020-01-01 |
df2:
| Company | Package | Badge Number | Work Date |
|----------|---------|--------------|------------|
| Compnay1 | X | 1 | 2020-01-01 |
| Compnay1 | Y | 1 | 2020-01-01 |
| Company2 | X | 1 | 2020-01-01 |
| Company2 | Y | 1 | 2020-01-01 |
| Company2 | X | 2 | 2020-01-01 |
What I need: Python code that does the equivalent of this SQL statement.
SELECT *
FROM df1
INNER JOIN df2
ON df1.[Badge Number] = df2.[Badge Number]
AND df1.[Work Date] = df2.[Work Date]
AND (df1.[Company] != df2.[Company] OR df1.[Package] != df2.[Package])
Result:
| df1.Company | df1.Package | df1.Badge Number | df1.Work Date | df2.Company | df2.Package | df2.Badge Number | df2.Work Date |
|-------------|-------------|------------------|---------------|-------------|-------------|------------------|---------------|
| Compnay1 | X | 1 | 2020-01-01 | Compnay1 | Y | 1 | 2020-01-01 |
| Compnay1 | X | 1 | 2020-01-01 | Company2 | X | 1 | 2020-01-01 |
| Compnay1 | X | 1 | 2020-01-01 | Company2 | Y | 1 | 2020-01-01 |
Can this be done purely in pandas, without writing a SQL query in the Python code?
One idea is to use DataFrame.merge on the two equality conditions:
df = df1.merge(df2, on=['Badge Number','Work Date'])
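Then filter the merged rows on the OR condition (Company_x and Package_x come from df1, Company_y and Package_y from df2, since merge adds these default suffixes to overlapping column names):
df[(df['Company_x'] != df['Company_y']) | (df['Package_x'] != df['Package_y'])]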
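For reference, here is a minimal end-to-end sketch of the merge-then-filter approach using the sample data from the question. The `_df1`/`_df2` suffixes and the names `merged`, `mask`, and `result` are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question (values kept exactly as given).
df1 = pd.DataFrame({
    "Company": ["Compnay1", "Company2"],
    "Package": ["X", "X"],
    "Badge Number": [1, 2],
    "Work Date": ["2020-01-01", "2020-01-01"],
})
df2 = pd.DataFrame({
    "Company": ["Compnay1", "Compnay1", "Company2", "Company2", "Company2"],
    "Package": ["X", "Y", "X", "Y", "X"],
    "Badge Number": [1, 1, 1, 1, 2],
    "Work Date": ["2020-01-01"] * 5,
})

# Step 1: inner join on the equality (AND) conditions.
# Custom suffixes (an illustrative choice) make the origin of each column explicit.
merged = df1.merge(
    df2,
    on=["Badge Number", "Work Date"],
    suffixes=("_df1", "_df2"),
)

# Step 2: apply the OR of the two not-equal conditions as a boolean mask.
mask = (
    (merged["Company_df1"] != merged["Company_df2"])
    | (merged["Package_df1"] != merged["Package_df2"])
)
result = merged[mask].reset_index(drop=True)

print(result)
```

Unlike the SQL `SELECT *`, the join keys (`Badge Number`, `Work Date`) appear only once in the output, because `merge` with `on=` keeps a single copy of each key column. Also note that pandas treats `NaN != NaN` as `True`, whereas SQL comparisons against `NULL` do not match, so the two can differ if `Company` or `Package` contain missing values.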