Python / Pandas - merging two dataframes based on non-index column values
I have two dataframes defined as follows:
import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"],
     ["1", "B", "123"],
     ["1", "C", "123"],
     ["2", "A", "456"],
     ["2", "B", "456"],
     ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)
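I want to merge these two dataframes into a single dataframe based on the values of the ID and LOCATION columns. In the LOCATION column, A corresponds to CTR, B to RIGHT, and C to LEFT. The result I am looking for is: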
    ID ROW LOCATION Attr1 Attr2 Attr3
0  123   1        A   ABC   DEF   GHI
1  123   1        B   ABC   DEF   GHI
2  123   1        C   ABC   DEF   GHI
3  456   2        A   JKL   MNO   PQR
4  456   2        B   JKL   MNO   PQR
5  456   2        C   JKL   MNO   PQR
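Using pandas.merge() I can merge dataframes on one or more columns, but I get a KeyError because the LOCATION column values do not match.
Is pandas.merge() the right function for this, and how can I use it to define which column values match?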
You can map the values with a dictionary and assign them to LOCATION; after that a simple merge is enough:
d = {'CTR': 'A', 'RIGHT': 'B', 'LEFT': 'C'}
dataframe2.merge(dataframe1.assign(LOCATION=dataframe1['LOCATION'].map(d)),
on=['ID', 'LOCATION'])
Output:
  ROW LOCATION   ID Attr1 Attr2 Attr3
0   1        A  123   ABC   DEF   GHI
1   1        B  123   ABC   DEF   GHI
2   1        C  123   ABC   DEF   GHI
3   2        A  456   JKL   MNO   PQR
4   2        B  456   JKL   MNO   PQR
5   2        C  456   JKL   MNO   PQR
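To confirm that every row actually found a partner after the mapping, the same merge can be run with pandas' `indicator` and `validate` options. This is only a sketch under the question's sample data; the names `mapped`, `checked` and `unmatched` are introduced here for illustration.

```python
import pandas as pd

dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"], ["1", "B", "123"], ["1", "C", "123"],
     ["2", "A", "456"], ["2", "B", "456"], ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)

d = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}

# Copy of dataframe1 with LOCATION translated to dataframe2's codes.
mapped = dataframe1.assign(LOCATION=dataframe1["LOCATION"].map(d))

checked = dataframe2.merge(
    mapped,
    on=["ID", "LOCATION"],
    how="left",              # keep every dataframe2 row, matched or not
    validate="one_to_one",   # raise if (ID, LOCATION) is duplicated on either side
    indicator=True,          # adds a "_merge" column: both / left_only / right_only
)

# Rows of dataframe2 that found no partner in dataframe1 (none for this data).
unmatched = checked[checked["_merge"] != "both"]
print(len(unmatched))

merged = checked.drop(columns="_merge")
print(merged)
```

With `how="left"` a failed match shows up as a `left_only` row instead of silently disappearing, which is easier to debug than an unexpectedly short result.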
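Another option is simply to map LOCATION through a dictionary. This step does not involve dataframe2, although the ROW column in the desired output still has to come from a later merge with dataframe2: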
locations = { 'CTR':'A' , 'RIGHT' : 'B', 'LEFT' : 'C'}
dataframe1['LOCATION'] = dataframe1['LOCATION'].map(locations)
print(dataframe1)
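Two caveats of mapping with a dictionary are worth checking before merging: `Series.map` turns any value missing from the dictionary into NaN, and applying the same mapping a second time (for example by re-running an in-place assignment) turns every value into NaN. The sketch below illustrates both on a small Series; the value "TOP" is hypothetical and not part of the question's data.

```python
import pandas as pd

locations = {"CTR": "A", "RIGHT": "B", "LEFT": "C"}

# "TOP" is a made-up value that has no entry in the dictionary.
s = pd.Series(["CTR", "RIGHT", "LEFT", "TOP"], name="LOCATION")

# map() returns NaN for values that are not dictionary keys.
mapped = s.map(locations)
missing = s[mapped.isna()].unique().tolist()
print(missing)

# Mapping the already-mapped values again loses everything,
# because "A", "B" and "C" are not keys of the dictionary.
mapped_twice = mapped.map(locations)
print(mapped_twice.isna().all())

# replace() leaves values without a dictionary entry unchanged.
kept = s.replace(locations)
print(kept.tolist())
```

Which behaviour is preferable depends on the data: NaN makes unmapped values easy to detect, while `replace` keeps them for later inspection.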
You can merge the two dataframes as follows:
dataframe1 = pd.DataFrame(
    [["123", "CTR", "ABC", "DEF", "GHI"],
     ["123", "RIGHT", "ABC", "DEF", "GHI"],
     ["123", "LEFT", "ABC", "DEF", "GHI"],
     ["456", "CTR", "JKL", "MNO", "PQR"],
     ["456", "RIGHT", "JKL", "MNO", "PQR"],
     ["456", "LEFT", "JKL", "MNO", "PQR"]],
    columns=["ID", "LOCATION", "Attr1", "Attr2", "Attr3"],
)
dataframe2 = pd.DataFrame(
    [["1", "A", "123"],
     ["1", "B", "123"],
     ["1", "C", "123"],
     ["2", "A", "456"],
     ["2", "B", "456"],
     ["2", "C", "456"]],
    columns=["ROW", "LOCATION", "ID"],
)
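# Translate dataframe1's LOCATION values to dataframe2's codes, then merge on both keys.
# (Joining left_on='ID' with right_on='LOCATION' compares IDs with location codes and matches nothing.)
codes = {'CTR': 'A', 'RIGHT': 'B', 'LEFT': 'C'}
result = dataframe1.assign(LOCATION=dataframe1['LOCATION'].map(codes)).merge(dataframe2, on=['ID', 'LOCATION'])
result = result[['ID', 'ROW', 'LOCATION', 'Attr1', 'Attr2', 'Attr3']]
print(result)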
Use pd.merge to combine the dataframes.
The output looks like this:
    ID ROW LOCATION Attr1 Attr2 Attr3
0  123   1        A   ABC   DEF   GHI
1  123   1        B   ABC   DEF   GHI
2  123   1        C   ABC   DEF   GHI
3  456   2        A   JKL   MNO   PQR
4  456   2        B   JKL   MNO   PQR
5  456   2        C   JKL   MNO   PQR