
Merging two Pandas dataframes based on column values

I have two dataframes defined as follows:

import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)

dataframe2 = pd.DataFrame(
    [["1", "A", "123"],
     ["1", "B", "123"],
     ["1", "C", "123"],
     ["2", "A", "456"],
     ["2", "B", "456"],
     ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

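I want to merge these two dataframes into a single dataframe based on the values of the ID and LOCATION columns. In the LOCATION column, A corresponds to CTR, B to RIGHT, and C to LEFT. The result I am looking for is: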

    ID ROW LOCATION Attr1 Attr2 Attr3
0  123   1        A   ABC   DEF   GHI
1  123   1        B   ABC   DEF   GHI
2  123   1        C   ABC   DEF   GHI
3  456   2        A   JKL   MNO   PQR
4  456   2        B   JKL   MNO   PQR
5  456   2        C   JKL   MNO   PQR

With pandas.merge() I can merge dataframes on one or more columns, but I get a KeyError because the LOCATION column values do not match.

Is pandas.merge() the right function for this, and how can I use it to define which column values match?

Use map with a dictionary and assign the mapped values to LOCATION; after that a simple merge works:

d = {'CTR': 'A', 'RIGHT': 'B', 'LEFT': 'C'}

dataframe2.merge(dataframe1.assign(LOCATION=dataframe1['LOCATION'].map(d)),
                 on=['ID', 'LOCATION'])

output:

  ROW LOCATION   ID Attr1 Attr2 Attr3
0   1        A  123   ABC   DEF   GHI
1   1        B  123   ABC   DEF   GHI
2   1        C  123   ABC   DEF   GHI
3   2        A  456   JKL   MNO   PQR
4   2        B  456   JKL   MNO   PQR
5   2        C  456   JKL   MNO   PQR
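
As a self-contained sketch of the same map-then-merge idea, the version below also checks that every label was translated and that every row of dataframe2 found a partner. The names `mapped`, `unmapped` and `merged`, and the choice of `how="left"`, `validate` and `indicator`, are illustrative assumptions rather than part of the answer above.

```python
import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"], ["1", "B", "123"], ["1", "C", "123"],
     ["2", "A", "456"], ["2", "B", "456"], ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

d = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}

# Series.map yields NaN for any label that is missing from the dictionary,
# so check for that before merging (hypothetical guard, not in the answer).
mapped = dataframe1["LOCATION"].map(d)
unmapped = dataframe1.loc[mapped.isna(), "LOCATION"].unique()
if len(unmapped) > 0:
    raise ValueError(f"no code defined for: {list(unmapped)}")

merged = dataframe2.merge(
    dataframe1.assign(LOCATION=mapped),
    on=["ID", "LOCATION"],
    how="left",             # keep every row of dataframe2, matched or not
    validate="one_to_one",  # raise if (ID, LOCATION) is duplicated on either side
    indicator=True,         # adds a _merge column: "both" or "left_only"
)
print(merged)
```

Rows whose `_merge` value is `left_only` would be rows of dataframe2 with no counterpart in dataframe1; with the data from the question, every row is `both`.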

Just map the LOCATION values through a dictionary; this step on its own does not involve dataframe2:

locations = { 'CTR':'A' , 'RIGHT' : 'B', 'LEFT' : 'C'}
dataframe1['LOCATION'] = dataframe1['LOCATION'].map(locations)

print(dataframe1)
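
The assignment above overwrites dataframe1's LOCATION column in place. If the original labels should survive, one possible variation is to invert the dictionary and attach the long labels to dataframe2 instead. This is only a sketch: the `LOCATION_LABEL` column and the names `code_to_label`, `left` and `right` are hypothetical and do not appear in the answer.

```python
import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"], ["1", "B", "123"], ["1", "C", "123"],
     ["2", "A", "456"], ["2", "B", "456"], ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

locations = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}
# Invert the mapping so the short codes can be translated to the long labels.
code_to_label = {code: label for label, code in locations.items()}

# assign() and rename() return new frames, so neither input is modified.
left = dataframe2.assign(LOCATION_LABEL=dataframe2["LOCATION"].map(code_to_label))
right = dataframe1.rename(columns={"LOCATION": "LOCATION_LABEL"})

result = left.merge(right, on=["ID", "LOCATION_LABEL"])
print(result)
```

The result keeps both the short code (LOCATION) and the long label (LOCATION_LABEL) side by side, which can make the mapping easier to audit.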

You can merge the two dataframes as follows:

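import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)

dataframe2 = pd.DataFrame(
    [["1", "A", "123"],
     ["1", "B", "123"],
     ["1", "C", "123"],
     ["2", "A", "456"],
     ["2", "B", "456"],
     ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

# Matching dataframe1's ID against dataframe2's LOCATION finds no rows,
# so translate the LOCATION labels first and merge on both key columns.
location_codes = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}
result = dataframe1.assign(
    LOCATION=dataframe1["LOCATION"].map(location_codes)
).merge(dataframe2, on=["ID", "LOCATION"])

# Reorder the columns to match the layout requested in the question.
result = result[["ID", "ROW", "LOCATION", "Attr1", "Attr2", "Attr3"]]
print(result)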

Use pd.merge to combine multiple dataframes.

The output looks like this:
    ID ROW LOCATION Attr1 Attr2 Attr3
0  123   1        A   ABC   DEF   GHI
1  123   1        B   ABC   DEF   GHI
2  123   1        C   ABC   DEF   GHI
3  456   2        A   JKL   MNO   PQR
4  456   2        B   JKL   MNO   PQR
5  456   2        C   JKL   MNO   PQR
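
One way to confirm that a merge really reproduces the table above is to build the expected frame by hand and compare the two. The sketch below assumes the LOCATION labels are translated with a dictionary before merging; the names `location_codes` and `expected` are illustrative.

```python
import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"], ["1", "B", "123"], ["1", "C", "123"],
     ["2", "A", "456"], ["2", "B", "456"], ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

# Assumed mapping from dataframe1's labels to dataframe2's codes.
location_codes = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}
result = (
    dataframe1.assign(LOCATION=dataframe1["LOCATION"].map(location_codes))
    .merge(dataframe2, on=["ID", "LOCATION"])
    [["ID", "ROW", "LOCATION", "Attr1", "Attr2", "Attr3"]]
)

# The table from the question, typed in by hand.
expected = pd.DataFrame(
    [["123", "1", "A", "ABC", "DEF", "GHI"],
     ["123", "1", "B", "ABC", "DEF", "GHI"],
     ["123", "1", "C", "ABC", "DEF", "GHI"],
     ["456", "2", "A", "JKL", "MNO", "PQR"],
     ["456", "2", "B", "JKL", "MNO", "PQR"],
     ["456", "2", "C", "JKL", "MNO", "PQR"]],
    columns=["ID", "ROW", "LOCATION", "Attr1", "Attr2", "Attr3"],
)

# Raises AssertionError with a diff if values, dtypes or column order differ.
pd.testing.assert_frame_equal(result, expected)
```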

