Insert rows into a DataFrame for each group, with the entries taken from another DataFrame
For clarity, I have created an MRE (minimal reproducible example).
import pandas as pd

# Fee records: one row per (region, type) observation
df = pd.DataFrame({
    "region": ["Canada", "Korea", "Norway", "China", "Canada", "Korea", "Norway", "China", "Canada", "Korea", "Norway", "China"],
    "type": ["A", "B", "C", "D", "A", "C", "C", "A", "B", "B", "B", "B"],
    "actual fees": [1235, 422, 333, 111, 1233, 555, 23, 3, 3, 4, 1, 2],
    "total fee": [2222, 444, 67, 711, 4873, 785, 453, 7, 7, 9, 11, 352]
})
# User counts: one row per region
df_to_insert = pd.DataFrame({
    "region": ["Canada", "Korea", "Norway", "China"],
    "users": [55, 36, 87, 250]
})
So my df looks like this:
             actual fees  total fee
region type
Canada A               2          2
       B               1          1
China  A               1          1
       B               1          1
       D               1          1
and df_to_insert looks like this:
   region  users
0  Canada     55
1   Korea     36
2  Norway     87
3   China    250
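What I want to do now is insert a "Users" row at the end of each region in the "type" level, with the users value under the "actual fees" column and the sum for that region under the "total fee" column.
So my desired dataframe looks like the following: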
             actual fees  total fee
region type
Canada A               2          2
       B               1          1
       Users          55          3
China  A               1          1
       B               1          1
       D               1          1
       Users         250          3
I hope this is clear enough. Please let me know if anything is unclear.
Thanks in advance!
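You can first melt df_to_insert, then fill in total fee by grouping df by region and mapping the sums back onto the mlt dataframe, and finally concat the two frames and call set_index to build the MultiIndex: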
mlt = df_to_insert.melt('region',var_name='type',value_name='actual fees')
mlt['total fee'] = mlt['region'].map(df.groupby('region')['total fee'].sum())
out = pd.concat((df,mlt),sort=False).set_index(['region','type']).sort_index()
print(out)
              actual fees  total fee
region type
Canada A             1235       2222
       A             1233       4873
       B                3          7
       users           55       7102
China  A                3          7
       B                2        352
       D              111        711
       users          250       1070
Korea  B              422        444
       B                4          9
       C              555        785
       users           36       1238
Norway B                1         11
       C              333         67
       C               23        453
       users           87        531
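In the output above, the users rows end up last in each region only because lowercase "users" happens to sort after the uppercase type labels. The following is a minimal sketch, not part of the original answer, of one way to make that ordering explicit and to use the "Users" label from the question's desired output. The helper column name _is_users is a hypothetical choice made for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "region": ["Canada", "Korea", "Norway", "China"] * 3,
    "type": ["A", "B", "C", "D", "A", "C", "C", "A", "B", "B", "B", "B"],
    "actual fees": [1235, 422, 333, 111, 1233, 555, 23, 3, 3, 4, 1, 2],
    "total fee": [2222, 444, 67, 711, 4873, 785, 453, 7, 7, 9, 11, 352],
})
df_to_insert = pd.DataFrame({
    "region": ["Canada", "Korea", "Norway", "China"],
    "users": [55, 36, 87, 250],
})

# Reshape the user counts into the same columns as df
mlt = df_to_insert.melt("region", var_name="type", value_name="actual fees")
mlt["type"] = "Users"  # label taken from the question's desired output
mlt["total fee"] = mlt["region"].map(df.groupby("region")["total fee"].sum())

# Hypothetical helper flag: False for fee rows, True for the inserted rows,
# so sorting on it pushes the inserted row to the end of every region.
combined = pd.concat([df.assign(_is_users=False), mlt.assign(_is_users=True)])
out = (
    combined.sort_values(["region", "_is_users", "type"])
    .drop(columns="_is_users")
    .set_index(["region", "type"])
)
print(out)
```

Sorting on the flag before the type means the result does not depend on how the type labels compare with the inserted label.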
You can see how melt works and how it helps with the concatenation:
print(df_to_insert.melt('region',var_name='type',value_name='actual fees'))
   region   type  actual fees
0  Canada  users           55
1   Korea  users           36
2  Norway  users           87
3   China  users          250
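To make the reshaping step more concrete, here is a small sketch (an illustration, not part of the original answer) that builds the same three-column frame by hand with rename and insert. The variable names melted and manual are chosen here for the comparison only.

```python
import pandas as pd

df_to_insert = pd.DataFrame({
    "region": ["Canada", "Korea", "Norway", "China"],
    "users": [55, 36, 87, 250],
})

# melt: the former column name "users" becomes a value in the new "type"
# column, and the column's values move into "actual fees"
melted = df_to_insert.melt("region", var_name="type", value_name="actual fees")

# The same result built by hand for this single-column case
manual = df_to_insert.rename(columns={"users": "actual fees"})
manual.insert(1, "type", "users")  # constant label placed after "region"

print(melted)
print(manual)
```

With only one value column the manual version is just as short; melt becomes more convenient if df_to_insert gains further columns, since each one would then turn into its own set of rows.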