Pandas: merge two DataFrames and drop the extra rows

Question

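How can I merge/join these two DataFrames on "sample_id" only, and drop the extra rows from the second DataFrame while merging/joining?

I am using pandas in Python.

First DataFrame (fdf)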

| sample_id | name  |
|-----------|-------|
| 1         | Mark  |
| 1         | Dart  |
| 2         | Julia |
| 2         | Oolia |
| 2         | Talia |

Second DataFrame (sdf)

| sample_id | salary | time |
|-----------|--------|------|
| 1         | 20     | 0    |
| 1         | 30     | 5    |
| 1         | 40     | 10   |
| 1         | 50     | 15   |
| 2         | 33     | 0    |
| 2         | 23     | 5    |
| 2         | 24     | 10   |
| 2         | 28     | 15   |
| 2         | 29     | 20   |

So the resulting df would look like this:

| sample_id | name  | salary | time |
|-----------|-------|--------|------|
| 1         | Mark  | 20     | 0    |
| 1         | Dart  | 30     | 5    |
| 2         | Julia | 33     | 0    |
| 2         | Oolia | 23     | 5    |
| 2         | Talia | 24     | 10   |

Answer 1

sample_id contains duplicates, so DataFrame.merge needs a helper column to pair the rows correctly. Use GroupBy.cumcount as a per-group counter:

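# Number the rows within each sample_id group (0, 1, 2, ...) in both frames,
# merge on sample_id plus that counter, then drop the helper column.
df = (fdf.assign(g=fdf.groupby('sample_id').cumcount())
        .merge(sdf.assign(g=sdf.groupby('sample_id').cumcount()), on=['sample_id', 'g'])
        .drop('g', axis=1))
print(df)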
   sample_id   name  salary  time
0          1   Mark      20     0
1          1   Dart      30     5
2          2  Julia      33     0
3          2  Oolia      23     5
4          2  Talia      24    10
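
To see why the helper column is needed, the steps can be run one at a time. This is a minimal sketch, with the two frames rebuilt from the tables in the question; the intermediate names plain, fdf_g and sdf_g are introduced here for illustration only.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
fdf = pd.DataFrame({
    "sample_id": [1, 1, 2, 2, 2],
    "name": ["Mark", "Dart", "Julia", "Oolia", "Talia"],
})
sdf = pd.DataFrame({
    "sample_id": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "salary": [20, 30, 40, 50, 33, 23, 24, 28, 29],
    "time": [0, 5, 10, 15, 0, 5, 10, 15, 20],
})

# A plain merge on sample_id pairs every name with every salary row
# of the same group: 2*4 + 3*5 = 23 rows.
plain = fdf.merge(sdf, on="sample_id")
print(len(plain))

# cumcount numbers the rows inside each sample_id group, starting at 0.
fdf_g = fdf.assign(g=fdf.groupby("sample_id").cumcount())
sdf_g = sdf.assign(g=sdf.groupby("sample_id").cumcount())
print(fdf_g["g"].tolist())  # [0, 1, 0, 1, 2]
print(sdf_g["g"].tolist())  # [0, 1, 2, 3, 0, 1, 2, 3, 4]

# Merging on both keys matches the n-th row of each group. The default
# inner join drops the sdf rows that have no counterpart in fdf.
df = fdf_g.merge(sdf_g, on=["sample_id", "g"]).drop(columns="g")
print(df)
```

The pairing is purely positional, so it depends on the row order of both frames. If sdf is not already in the intended order, it would need sorting (for example by time) before the counter is computed.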

Answer 2

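import pandas as pd

# Left-join on sample_id; this pairs every name with every sdf row of its group.
final_res = pd.merge(fdf, sdf, on=['sample_id'], how='left')
final_res.sort_values(['sample_id', 'name', 'time'], inplace=True)
# Keep only the earliest-time row per (sample_id, name). Every name then gets
# the time == 0 row of its group, which differs from the table in the question.
final_res.drop_duplicates(subset=['sample_id', 'name'], keep='first', inplace=True)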
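
As a quick check, the merge-then-deduplicate approach can be run on the question's data. This is a sketch with the frames rebuilt from the tables in the question; the chained form and the name res are choices made here, not part of the original answer.

```python
import pandas as pd

# Frames rebuilt from the tables in the question.
fdf = pd.DataFrame({
    "sample_id": [1, 1, 2, 2, 2],
    "name": ["Mark", "Dart", "Julia", "Oolia", "Talia"],
})
sdf = pd.DataFrame({
    "sample_id": [1, 1, 1, 1, 2, 2, 2, 2, 2],
    "salary": [20, 30, 40, 50, 33, 23, 24, 28, 29],
    "time": [0, 5, 10, 15, 0, 5, 10, 15, 20],
})

res = (
    pd.merge(fdf, sdf, on="sample_id", how="left")
      .sort_values(["sample_id", "name", "time"])
      # The first row per (sample_id, name) after sorting is the time == 0 row.
      .drop_duplicates(subset=["sample_id", "name"], keep="first")
      .reset_index(drop=True)
)
print(res)
```

On this data every name ends up with the time == 0 row of its group (salary 20 for sample_id 1, salary 33 for sample_id 2), and the rows come out sorted by name. That is one row per name, but not the positional pairing shown in the question; the cumcount helper column from Answer 1 is what reproduces that table.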

Answer 1: 0, accepted, 2019-07-19 07:48:27

Answer 2: 0, 2019-07-19 07:56:07
