Python pandas: get rows from left table and from right table missing in left table

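I have a left and a right table, and I need to combine the FileStamp values of both in the following way: take all values from the left table, plus the values from the right table that are missing in the left table, joined on 'date':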

import pandas as pd
left = pd.DataFrame({'FileStamp': ['T101', 'T102', 'T103', 'T104'], 'date': [20180101, 20180102, 20180103, 20180104]})
right = pd.DataFrame({'FileStamp': ['T501', 'T502'], 'date': [20180104, 20180105]})

Something like

result = pd.merge(left, right, how='outer', on='date')

But a plain 'outer' merge does not give the result I want.
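
To make the problem concrete, here is a minimal sketch that runs the plain outer merge on the two tables above. The variable name `outer` is only illustrative. The date 20180104 exists in both tables, so its row picks up T501 from the right table, which is the value that should not appear.

```python
import pandas as pd

left = pd.DataFrame({'FileStamp': ['T101', 'T102', 'T103', 'T104'],
                     'date': [20180101, 20180102, 20180103, 20180104]})
right = pd.DataFrame({'FileStamp': ['T501', 'T502'],
                      'date': [20180104, 20180105]})

# Plain outer merge: every date from both tables is kept
outer = pd.merge(left, right, how='outer', on='date')

# The shared date 20180104 gets FileStamp_x == 'T104' and FileStamp_y == 'T501'
shared = outer[outer['date'] == 20180104]
print(shared)
```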

The desired output should look like this:

     FileStamp_x      date      FileStamp_y
0        T101       20180101         NaN
1        T102       20180102         NaN
2        T103       20180103         NaN
3        T104       20180104         NaN
4         NaN       20180105        T502

Is there a simple way to get the desired output?

Filter with isin before the merge:

r = right[~right['date'].isin(left['date'])]
print (r)
  FileStamp      date
1      T502  20180105

result = pd.merge(left, r, how='outer', on='date')
print (result)
  FileStamp_x      date FileStamp_y
0        T101  20180101         NaN
1        T102  20180102         NaN
2        T103  20180103         NaN
3        T104  20180104         NaN
4         NaN  20180105        T502
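
If this pattern is needed in more than one place, the filter and the merge can be wrapped in a small function. This is only a sketch: `merge_left_priority` is a hypothetical name, not a pandas function, and it assumes a single key column shared by both tables.

```python
import pandas as pd

def merge_left_priority(left, right, on):
    """Hypothetical helper: all rows of `left`, plus the rows of `right`
    whose key value does not occur in `left`."""
    # Keep only the right-hand rows whose key is absent from the left table
    only_right = right[~right[on].isin(left[on])]
    # The outer merge now has no overlapping keys, so nothing is overwritten
    return pd.merge(left, only_right, how='outer', on=on)

left = pd.DataFrame({'FileStamp': ['T101', 'T102', 'T103', 'T104'],
                     'date': [20180101, 20180102, 20180103, 20180104]})
right = pd.DataFrame({'FileStamp': ['T501', 'T502'],
                      'date': [20180104, 20180105]})

result = merge_left_priority(left, right, on='date')
print(result)
```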

You can adjust the values after the merge:

import numpy as np
result = pd.merge(left, right, how='outer', on='date')
# keep FileStamp_y only in rows where the left table has no value for that date
result['FileStamp_y'] = np.where(result['FileStamp_x'].isnull(), result['FileStamp_y'], np.nan)

Result:

    FileStamp_x     date  FileStamp_y
0          T101 20180101          NaN
1          T102 20180102          NaN
2          T103 20180103          NaN
3          T104 20180104          NaN
4           NaN 20180105         T502
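
The same adjustment can probably be written without NumPy by using `Series.where`, which keeps values where the condition is true and fills NaN elsewhere. A minimal sketch on the same tables follows. Note that this approach, like the `np.where` version, assumes the left table's FileStamp column has no missing values of its own, since a NaN there would be mistaken for an unmatched date.

```python
import pandas as pd

left = pd.DataFrame({'FileStamp': ['T101', 'T102', 'T103', 'T104'],
                     'date': [20180101, 20180102, 20180103, 20180104]})
right = pd.DataFrame({'FileStamp': ['T501', 'T502'],
                      'date': [20180104, 20180105]})

result = pd.merge(left, right, how='outer', on='date')

# Keep FileStamp_y only where the left side is missing; other rows become NaN
result['FileStamp_y'] = result['FileStamp_y'].where(result['FileStamp_x'].isna())
print(result)
```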
