Python Pandas merge 2 dataframes
I am trying to merge 2 dataframes that hold the same information but break it down differently.
df1: # net totals at the team level
Team    Current Sales  Previous Sales  Team Total Diff
Blue    10             5               5
Orange  20             8               12
Yellow  40             11              29
df2: # net totals broken down by region
Team    Region  Curr Sales  Prev Sales  Net Diff
Blue    East    4           4           0
Blue    West    6           1           5
Orange  East    6           3           3
Orange  West    14          5           9
Yellow  East    15          3           12
Yellow  West    25          8           17
Desired merged dataframe:
Team    Region  Curr Sales  Prev Sales  Net Diff  Team Total Diff
Blue    East    4           4           0         5
Blue    West    6           1           5         5
Orange  East    6           3           3         12
Orange  West    14          5           9         12
Yellow  East    15          3           12        29
Yellow  West    25          8           17        29
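I am doing this so that I can run other statistical functions on the new column, but I am not sure how to merge the two. If I add df1['Team Total Diff'] to df2, it populates only the first 3 records and does not fill in the value for each team.
If I use the following merge call, I don't see any change: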
df2.merge(df1[['team_sort', 'Team']], how='inner', on='Team')
'team_sort' is used as an index to keep the teams in ascending order based on Net Team Diff.
Any help would be greatly appreciated.
You can use map in this scenario:
df2['Team Total Diff'] = df2['Team'].map(df1.set_index('Team')['Team Total Diff'])
df2
Output:
     Team Region  Curr Sales  Prev Sales  Net Diff  Team Total Diff
0    Blue   East           4           4         0                5
1    Blue   West           6           1         5                5
2  Orange   East           6           3         3               12
3  Orange   West          14           5         9               12
4  Yellow   East          15           3        12               29
5  Yellow   West          25           8        17               29
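A minimal, self-contained sketch of the map approach, using the sample data from the question. It also shows a plausible reason the direct column assignment filled only the first 3 records: pandas aligns the assigned Series on the row index (0 to 2 in df1 versus 0 to 5 in df2), not on the team name. The names by_index and team_totals are illustrative and not from the original post.

```python
import pandas as pd

# Sample data copied from the question
df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Current Sales": [10, 20, 40],
    "Previous Sales": [5, 8, 11],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Curr Sales": [4, 6, 6, 14, 15, 25],
    "Prev Sales": [4, 1, 3, 5, 3, 8],
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# Assigning the column directly aligns on the row index, so only rows 0-2
# receive a value and rows 3-5 become NaN (by_index is an illustrative copy).
by_index = df2.assign(**{"Team Total Diff": df1["Team Total Diff"]})

# map instead looks each team name up in a Series indexed by Team
team_totals = df1.set_index("Team")["Team Total Diff"]
df2["Team Total Diff"] = df2["Team"].map(team_totals)

print(by_index["Team Total Diff"].isna().sum())  # count of unfilled rows
print(df2)
```

Note that map needs the lookup Series to have unique index values, which holds here because each team appears once in df1.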
merge is the right approach, but you are using it the wrong way. Try this:
merged_df = df2.merge(df1[['Team', 'Team Total Diff']], on=['Team'])
This is because merge, like most DataFrame methods, actually returns a new DataFrame object rather than modifying self.
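The point about merge returning a new object can be checked with a short sketch. The data is a trimmed version of the question's sample; how="left" and validate="many_to_one" are optional additions not present in the original answer (the first keeps df2's row order, the second raises if a team appears more than once in df1). Note also that the call in the question selected only 'team_sort' and 'Team' from df1, so 'Team Total Diff' would not have appeared even if the result had been assigned.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

columns_before = list(df2.columns)

# The return value is discarded here, so df2 itself is left untouched
df2.merge(df1[["Team", "Team Total Diff"]], on="Team")
unchanged = list(df2.columns) == columns_before

# Assign the result to keep it
merged_df = df2.merge(
    df1[["Team", "Team Total Diff"]],
    on="Team",
    how="left",              # keep every row of df2, in df2's order
    validate="many_to_one",  # each Team must be unique in df1
)

print(unchanged)
print(merged_df)
```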
Index handling can be a bit tricky, so I usually just reset the index before merging dataframes.
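A small sketch of the reset-index habit described above, assuming (hypothetically) that Team has ended up as the index of df1 rather than as a regular column. The name df1_flat is illustrative.

```python
import pandas as pd

# Assumption for illustration: Team is the index of df1, not a column
df1 = pd.DataFrame(
    {"Team Total Diff": [5, 12, 29]},
    index=pd.Index(["Blue", "Orange", "Yellow"], name="Team"),
)
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# reset_index turns the Team index back into an ordinary column,
# so both frames expose the key in the same way
df1_flat = df1.reset_index()

merged_df = df2.merge(df1_flat, on="Team", how="left")
print(merged_df)
```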
I think this should do it:
merged_df = pd.merge(df1, df2, how="right", left_on="Team", right_on="Team")
merged_df = pd.concat([df1,df2], join='inner')
The default for join is outer, so try inner. If that does not work, use outer:
merged_df = pd.concat([df1,df2], join='outer')
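For reference, a short sketch of what these two concat calls return on a trimmed version of the question's data. With the default axis=0, concat stacks the frames row-wise, and join only controls which columns are kept: it does not match rows on Team. For the output the question asks for, a key-based merge or map is therefore likely the closer fit. The names stacked_inner and stacked_outer are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# join="inner": rows are stacked and only the columns shared by both frames remain
stacked_inner = pd.concat([df1, df2], join="inner")

# join="outer": rows are stacked and missing columns are filled with NaN
stacked_outer = pd.concat([df1, df2], join="outer")

print(stacked_inner.shape, list(stacked_inner.columns))
print(stacked_outer.shape, list(stacked_outer.columns))
```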