Add specific column values based on another DataFrame
I have a first DataFrame:
df1:
A        B      C  D
Car             0
Bike            0
Train           0
Plane           0
Other_1  Plane  2
Other_2  Plane  3
Other_3  Plane  4
And a second one:
df2:
A B
Car 4 %
Bike 5 %
Train 6 %
Plane 7 %
So I would like to get this combined result:
df1:
A        B      C  D
Car             0  4 %
Bike            0  5 %
Train           0  6 %
Plane           0  7 %
Other_1  Plane  2  2
Other_2  Plane  3  3
Other_3  Plane  4  4
What is the best way to do this?
If df and df2 share the same index, you can use:
df['D'] = df2['B'].combine_first(df['C'])
Output:
A B C D
0 Car NaN 0 4 %
1 Bike NaN 0 5 %
2 Train NaN 0 6 %
3 Plane NaN 0 7 %
4 Other_1 Plane 2 2
5 Other_2 Plane 3 3
6 Other_3 Plane 4 4
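To try the index-aligned approach end to end, here is a minimal sketch. The frame construction is an assumption: the blank cells of the question are taken to be NaN, and both frames use the default integer index.

import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frames (blank cells -> NaN).
df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan, np.nan, np.nan, np.nan, "Plane", "Plane", "Plane"],
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": [np.nan] * 7,
})
df2 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane"],
    "B": ["4 %", "5 %", "6 %", "7 %"],
})

# df2["B"] only has index labels 0..3; combine_first keeps those values
# and takes the remaining labels (4..6) from df["C"].
df["D"] = df2["B"].combine_first(df["C"])
print(df)

This only gives the intended result because row i of df2 describes the same item as row i of df. If the rows are ordered differently, the values are paired by index label, not by the content of column A.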
If the indexes differ, you can use merge on column A:
df_out = df.merge(df2, on ='A', how='left', suffixes=('','y'))
df_out.assign(D = df_out.By.fillna(df_out.C)).drop('By', axis=1)
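Or use this improved one-liner from @piRSquared (follow it with a fillna on C, as above, to fill D for rows that have no match in df2):
df.drop(columns='D').merge(df2.rename(columns={'B':'D'}), how='left', on='A')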
Output:
A B C D
0 Car NaN 0 4 %
1 Bike NaN 0 5 %
2 Train NaN 0 6 %
3 Plane NaN 0 7 %
4 Other_1 Plane 2 2
5 Other_2 Plane 3 3
6 Other_3 Plane 4 4
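The merge route can be sketched as a self-contained script. The frames are an assumed reconstruction of the question's data, df2 is deliberately reversed to show that matching happens on column A and not on the index, and the name merged is only illustrative.

import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frames (blank cells -> NaN).
df = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan, np.nan, np.nan, np.nan, "Plane", "Plane", "Plane"],
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": [np.nan] * 7,
})
df2 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane"],
    "B": ["4 %", "5 %", "6 %", "7 %"],
})

# Reverse df2 so that its index no longer lines up with df.
df2 = df2.iloc[::-1].reset_index(drop=True)

# Drop the empty D, then bring df2's B in as the new D, matched on A.
merged = df.drop(columns="D").merge(
    df2.rename(columns={"B": "D"}), on="A", how="left"
)

# Rows without a match in df2 have NaN in D; fall back to C for those.
# astype(object) is a precaution so mixed strings and numbers can share the column.
merged["D"] = merged["D"].astype(object).fillna(merged["C"])
print(merged)

A left merge keeps every row of df in its original order, so the result has the same seven rows as the input.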
Using map (rows whose A value has no match in df2 get NaN in D):
df1.assign(D=df1.A.map(dict(zip(df2.A, df2.B))))
A B C D
0 Car NaN 0 4 %
1 Bike NaN 0 5 %
2 Train NaN 0 6 %
3 Plane NaN 0 7 %
4 Other_1 Plane 2 NaN
5 Other_2 Plane 3 NaN
6 Other_3 Plane 4 NaN
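The map output above leaves NaN in D for the Other_* rows, whereas the question wants the value of C there. One possible way to close that gap, not part of the original answer, is to chain combine_first onto the mapped column. The frames below are an assumed reconstruction of the question's data, and lookup and out are illustrative names.

import numpy as np
import pandas as pd

# Assumed reconstruction of the question's frames (blank cells -> NaN).
df1 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane", "Other_1", "Other_2", "Other_3"],
    "B": [np.nan, np.nan, np.nan, np.nan, "Plane", "Plane", "Plane"],
    "C": [0, 0, 0, 0, 2, 3, 4],
    "D": [np.nan] * 7,
})
df2 = pd.DataFrame({
    "A": ["Car", "Bike", "Train", "Plane"],
    "B": ["4 %", "5 %", "6 %", "7 %"],
})

# Build a plain dict from df2: {"Car": "4 %", "Bike": "5 %", ...}.
lookup = dict(zip(df2["A"], df2["B"]))

# map() looks up each value of A; keys missing from the dict become NaN,
# and combine_first then fills those gaps from column C.
out = df1.assign(D=df1["A"].map(lookup).combine_first(df1["C"]))
print(out)

Like merge, this matches on the values of column A, so it does not depend on the two frames sharing an index. assign returns a new frame and leaves df1 unchanged.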