
Python - Pandas - finding matches between two data frames

Suppose I have 2 pandas data frames that share the same column names, as follows:

name:          dob:        role:
James Franco   1-1-1980    Actor
Cameron Diaz   4-2-1976    Actor
Jim Carey      12-1-1968   Actor
Miley Cyrus    5-23-1987   Actor


name:         dob:         role:
50 cent       4-6-1984     Singer
lil baby      12-1-1990    Singer
ghostmane     8-10-1989    Singer
Miley Cyrus   5-23-1987    Singer

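Suppose I want to identify individuals who have the same name and date of birth and who appear in both data frames (and therefore have two different roles).

How can I do this?

Something similar to what I would do if everything were in one dataframe: df.groupby(["name", "dob"]).count()

I would like to be able to identify these people, print them, and count the number of occurrences.

Thanks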

import pandas as pd
df2 = pd.concat([df, df1])  # stack the two dfs (DataFrame.append was removed in pandas 2.0)
dfnew = df2[df2.duplicated(subset=["name:", "dob:"], keep=False)]  # keep every row duplicated on the columns you want to check
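
For readers who want to run the stack-and-filter idea end to end, here is a minimal sketch. It assumes the two frames are named df and df1 as in the snippet above, rebuilds them from the sample rows in the question, and uses pd.concat for the stacking step. The variable names combined, dupes, and counts and the "occurrences" column are illustrative, not from the original answer.

```python
import pandas as pd

# Sample frames copied from the question; column names keep the trailing colon.
df = pd.DataFrame({
    "name:": ["James Franco", "Cameron Diaz", "Jim Carey", "Miley Cyrus"],
    "dob:": ["1-1-1980", "4-2-1976", "12-1-1968", "5-23-1987"],
    "role:": ["Actor"] * 4,
})
df1 = pd.DataFrame({
    "name:": ["50 cent", "lil baby", "ghostmane", "Miley Cyrus"],
    "dob:": ["4-6-1984", "12-1-1990", "8-10-1989", "5-23-1987"],
    "role:": ["Singer"] * 4,
})

# Stack the frames; ignore_index avoids repeated index labels.
combined = pd.concat([df, df1], ignore_index=True)

# keep=False marks every row that shares a (name, dob) pair with another row.
dupes = combined[combined.duplicated(subset=["name:", "dob:"], keep=False)]
print(dupes)

# Count how many times each duplicated person appears (hypothetical column name).
counts = dupes.groupby(["name:", "dob:"]).size().reset_index(name="occurrences")
print(counts)
```

Note that this approach also flags a person who is listed twice within a single frame, so it only answers "present in both frames" if each frame has no internal duplicates on name and dob.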

Well, this will give you the matches:

df1.merge(df2, on=["name:", "dob:"])

Output:

         name:       dob: role:_x role:_y
0  Miley Cyrus  5-23-1987   Actor  Singer
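
If the default _x and _y suffixes are hard to read, the same inner merge can label the two role columns explicitly and report how many people matched. This is a sketch under the assumption that the frames are called df1 and df2 as in the merge call above; the suffix strings and the name matches are made up for illustration.

```python
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({
    "name:": ["James Franco", "Cameron Diaz", "Jim Carey", "Miley Cyrus"],
    "dob:": ["1-1-1980", "4-2-1976", "12-1-1968", "5-23-1987"],
    "role:": ["Actor"] * 4,
})
df2 = pd.DataFrame({
    "name:": ["50 cent", "lil baby", "ghostmane", "Miley Cyrus"],
    "dob:": ["4-6-1984", "12-1-1990", "8-10-1989", "5-23-1987"],
    "role:": ["Singer"] * 4,
})

# Inner merge (the default) keeps only (name, dob) pairs found in both frames.
# suffixes renames the overlapping "role:" column from each side.
matches = df1.merge(df2, on=["name:", "dob:"], suffixes=("_df1", "_df2"))

print(matches)
print("people in both frames:", len(matches))
```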

You can use an outer join to get all the results and filter them as needed:

df1.merge(df2, how="outer", on=["name:", "dob:"])

Output:

          name:       dob: role:_x role:_y
0  James Franco   1-1-1980   Actor     NaN
1  Cameron Diaz   4-2-1976   Actor     NaN
2     Jim Carey  12-1-1968   Actor     NaN
3   Miley Cyrus  5-23-1987   Actor  Singer
4       50 cent   4-6-1984     NaN  Singer
5      lil baby  12-1-1990     NaN  Singer
6     ghostmane  8-10-1989     NaN  Singer
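
One way to do the filtering step is to let merge record where each row came from. The sketch below passes indicator=True, which adds a _merge column holding "left_only", "right_only", or "both". The frame names df1 and df2 follow the answer above; merged and both are illustrative names, and the row order of the result may differ between pandas versions.

```python
import pandas as pd

# Sample frames rebuilt from the question.
df1 = pd.DataFrame({
    "name:": ["James Franco", "Cameron Diaz", "Jim Carey", "Miley Cyrus"],
    "dob:": ["1-1-1980", "4-2-1976", "12-1-1968", "5-23-1987"],
    "role:": ["Actor"] * 4,
})
df2 = pd.DataFrame({
    "name:": ["50 cent", "lil baby", "ghostmane", "Miley Cyrus"],
    "dob:": ["4-6-1984", "12-1-1990", "8-10-1989", "5-23-1987"],
    "role:": ["Singer"] * 4,
})

# Outer merge keeps every row; indicator=True tags each row's origin in "_merge".
merged = df1.merge(df2, how="outer", on=["name:", "dob:"], indicator=True)

# Rows present in both frames.
both = merged[merged["_merge"] == "both"]
print(both[["name:", "dob:", "role:_x", "role:_y"]])

# How many rows fall into each origin category.
print(merged["_merge"].value_counts())
```

Filtering on "left_only" or "right_only" instead gives the people who appear in just one of the two frames.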
