
Python Pandas merge 2 dataframes

I am trying to merge 2 dataframes that hold the same information, but broken down in different ways.

df1: # net totals at the team level

Team    Current Sales    Previous Sales    Team Total Diff
Blue    10               5                 5
Orange  20               8                 12
Yellow  40               11                29

df2: # net totals broken down by region

Team    Region    Curr Sales    Prev Sales    Net Diff
Blue    East      4             4             0
Blue    West      6             1             5
Orange  East      6             3             3
Orange  West      14            5             9
Yellow  East      15            3             12
Yellow  West      25            8             17

Merged dataframe (desired result):

Team    Region    Curr Sales    Prev Sales     Net Diff    Team Total Diff
Blue    East      4             4              0           5
Blue    West      6             1              5           5
Orange  East      6             3              3           12
Orange  West      14            5              9           12
Yellow  East      15            3              12          29
Yellow  West      25            8              17          29

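I am doing this so that I can run other statistical functions on the new column, but I am not sure how to merge the two. If I add df1['Team Total Diff'] to df2, it only fills in the first 3 records and does not fill in the value for each team by name.

If I use the merge call below, I do not see any change: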

df2.merge(df1[['team_sort', 'Team']], how='inner', on='Team')

'team_sort' is used as an index to keep the teams in ascending order of Net Team Diff.

Any help would be appreciated.
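The partial fill described in the question is consistent with how pandas aligns data: assigning a Series to a column matches on the row index, not on the Team column. A minimal sketch, assuming both frames carry the default 0..n-1 index and using trimmed-down copies of the question's tables (the name `naive` is made up for illustration):

```python
import pandas as pd

# Trimmed-down copies of the question's frames (default integer index assumed)
df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# Column assignment aligns on index labels. df1 only has labels 0, 1 and 2,
# so rows 3, 4 and 5 of the new column end up as NaN.
naive = df2.copy()
naive["Team Total Diff"] = df1["Team Total Diff"]
print(naive)
```

Rows 0 to 2 receive 5, 12 and 29 in df1's order regardless of team, which is why the values also land on the wrong teams.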

You can use map in this scenario:

df2['Team Total Diff'] = df2['Team'].map(df1.set_index('Team')['Team Total Diff'])
df2

Output:

     Team Region  Curr Sales  Prev Sales  Net Diff  Team Total Diff
0    Blue   East           4           4         0                5
1    Blue   West           6           1         5                5
2  Orange   East           6           3         3               12
3  Orange   West          14           5         9               12
4  Yellow   East          15           3        12               29
5  Yellow   West          25           8        17               29
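For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the question's two frames and applies the same map call. The intermediate name `lookup` is introduced here for readability and is not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Current Sales": [10, 20, 40],
    "Previous Sales": [5, 8, 11],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Curr Sales": [4, 6, 6, 14, 15, 25],
    "Prev Sales": [4, 1, 3, 5, 3, 8],
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# Series indexed by Team; it acts as a lookup table from team name to total
lookup = df1.set_index("Team")["Team Total Diff"]

# map() looks up each Team value of df2 in that Series
df2["Team Total Diff"] = df2["Team"].map(lookup)
print(df2)
```

This relies on each team appearing only once in df1. A team present in df2 but missing from df1 would get NaN.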

merge is the right tool, but you are using it the wrong way. Try this:

merged_df = df2.merge(df1[['Team', 'Team Total Diff']], on=['Team'])

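This is because merge, like most DataFrame methods, returns a new DataFrame object rather than modifying self, so the result has to be assigned to a variable. Note also that the column you want, 'Team Total Diff', has to be among the columns selected from df1.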
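A short sketch of that point, using trimmed-down copies of the question's frames: calling merge without keeping the result leaves df2 untouched, while assigning the result gives the merged table.

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# The return value is discarded here, so df2 itself does not change
df2.merge(df1[["Team", "Team Total Diff"]], on="Team")
print("Team Total Diff" in df2.columns)

# Keeping the returned frame is what makes the new column available
merged_df = df2.merge(df1[["Team", "Team Total Diff"]], on="Team")
print(merged_df)
```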

How the index is handled can get a little tricky, so I usually just reset the index before merging dataframes.
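A minimal sketch of the reset-then-merge habit. It assumes, purely for illustration, that df1 arrives with Team stored in its index; the names `df1_indexed` and `df1_flat` are hypothetical.

```python
import pandas as pd

# Assumption for illustration: Team lives in the index of df1, not in a column
df1_indexed = pd.DataFrame(
    {"Team Total Diff": [5, 12, 29]},
    index=pd.Index(["Blue", "Orange", "Yellow"], name="Team"),
)
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# reset_index() turns the index back into an ordinary "Team" column
df1_flat = df1_indexed.reset_index()

# With Team as a plain column on both sides, the merge key is unambiguous
merged_df = df2.merge(df1_flat, on="Team")
print(merged_df)
```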

I think this should do it:

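# how must be a string; "right" keeps every row of df2 (the right-hand frame)
merged_df = pd.merge(df1, df2, how="right", left_on="Team", right_on="Team")
# Alternative: concat stacks the rows and keeps only the columns both frames share
merged_df = pd.concat([df1, df2], join='inner')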

The default for join is outer, so try inner. If that does not work, use outer:

merged_df = pd.concat([df1,df2], join='outer')
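Before relying on concat here, it may help to check what it returns. As far as I can tell, pd.concat on the default axis stacks the rows of both frames instead of matching them on Team, and join only controls which columns survive. A sketch with trimmed-down copies of the question's frames (the names `stacked_inner` and `stacked_outer` are made up for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({
    "Team": ["Blue", "Orange", "Yellow"],
    "Team Total Diff": [5, 12, 29],
})
df2 = pd.DataFrame({
    "Team": ["Blue", "Blue", "Orange", "Orange", "Yellow", "Yellow"],
    "Region": ["East", "West"] * 3,
    "Net Diff": [0, 5, 3, 9, 12, 17],
})

# join="inner": rows are stacked, only the columns common to both frames remain
stacked_inner = pd.concat([df1, df2], join="inner")
print(stacked_inner.shape, list(stacked_inner.columns))

# join="outer": rows are stacked, all columns are kept and gaps become NaN
stacked_outer = pd.concat([df1, df2], join="outer")
print(stacked_outer.shape)
print(stacked_outer)
```

Neither variant reproduces the 6-row table asked for in the question, so a key-based merge or map is likely the better fit for this case.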
