Pandas - Find values between dataframes and sum a series for the matched rows
I am trying to look up the values of DataframeB.ColA in DataframeA.IDCol2 and then build a dataframe containing DataframeA.IDCol1 and the sum of DataframeA.IDCol3 over the rows that matched.
DataframeA
IDCol1 IDCol2 IDCol3
0 ABC 123 2
1 ABC 456 5
2 ABC 789 2
3 ABC 1011 1
4 CDE 123 3
5 CDE 456 2
6 CDE CCC 4
7 CDE AAA 1
DataframeB
ColA
0 123
1 456
2 CCC
3 1011
Output
Col Sum
0 ABC 8
1 CDE 9
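For anyone who wants to reproduce the example, here is a minimal sketch that builds the two frames. It assumes IDCol2 and ColA hold strings, since the sample mixes numeric-looking IDs with alphabetic ones; the question does not state the actual dtypes.

```python
import pandas as pd

# Sample data from the question. Storing the IDs as strings is an assumption.
DataframeA = pd.DataFrame({
    "IDCol1": ["ABC"] * 4 + ["CDE"] * 4,
    "IDCol2": ["123", "456", "789", "1011", "123", "456", "CCC", "AAA"],
    "IDCol3": [2, 5, 2, 1, 3, 2, 4, 1],
})
DataframeB = pd.DataFrame({"ColA": ["123", "456", "CCC", "1011"]})

print(DataframeA)
print(DataframeB)
```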
Use DataFrame.merge first and then aggregate with sum:
df = (DataframeA.merge(DataframeB, left_on='IDCol2', right_on='ColA')
                .groupby('IDCol1', as_index=False)['IDCol3']
                .sum())
print(df)
IDCol1 IDCol3
0 ABC 8
1 CDE 9
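One thing to watch with the merge approach: an inner merge repeats a row of DataframeA once per matching row in DataframeB, so duplicate keys in ColA would inflate the sums. The sketch below assumes a hypothetical DataframeB with a repeated key (the question's data has none) and deduplicates the keys first. It also renames the columns to the Col / Sum headers shown in the desired output.

```python
import pandas as pd

DataframeA = pd.DataFrame({
    "IDCol1": ["ABC"] * 4 + ["CDE"] * 4,
    "IDCol2": ["123", "456", "789", "1011", "123", "456", "CCC", "AAA"],
    "IDCol3": [2, 5, 2, 1, 3, 2, 4, 1],
})
# Hypothetical variant of DataframeB: "123" appears twice.
DataframeB = pd.DataFrame({"ColA": ["123", "456", "CCC", "1011", "123"]})

# Naive merge: rows with IDCol2 == "123" are counted twice.
naive = (DataframeA.merge(DataframeB, left_on="IDCol2", right_on="ColA")
                   .groupby("IDCol1", as_index=False)["IDCol3"]
                   .sum())
print(naive)   # ABC 10, CDE 12

# Deduplicate the lookup keys before merging so each row matches at most once.
lookup = DataframeB[["ColA"]].drop_duplicates()
result = (DataframeA.merge(lookup, left_on="IDCol2", right_on="ColA")
                    .groupby("IDCol1", as_index=False)["IDCol3"]
                    .sum()
                    .rename(columns={"IDCol1": "Col", "IDCol3": "Sum"}))
print(result)  # ABC 8, CDE 9
```

If ColA is known to be unique, the drop_duplicates step is unnecessary.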
Another solution:
s = DataframeB['ColA']
# Series.sum(level=0) was removed in pandas 2.0; group on the index level instead
df = (DataframeA.set_index('IDCol1')
                .query('IDCol2 in @s')['IDCol3']
                .groupby(level=0)
                .sum()
                .reset_index())
print(df)
IDCol1 IDCol3
0 ABC 8
1 CDE 9
Use Series.isin() and groupby() with sum:
# dfA and dfB are DataframeA and DataframeB from the question
dfA[dfA.IDCol2.isin(dfB.ColA)].groupby('IDCol1')['IDCol3'].sum().reset_index(name='Sum')
IDCol1 Sum
0 ABC 8
1 CDE 9
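A caveat that applies to all of the approaches above: the match is by exact value, so the string "123" does not match the integer 123. The sketch below assumes a hypothetical situation where ColA was loaded with mixed Python types while IDCol2 holds strings. The question does not say this happens, but it is a common reason for getting fewer matches than expected. Casting both sides to str before calling isin() is one way to handle it.

```python
import pandas as pd

dfA = pd.DataFrame({
    "IDCol1": ["ABC"] * 4 + ["CDE"] * 4,
    "IDCol2": ["123", "456", "789", "1011", "123", "456", "CCC", "AAA"],
    "IDCol3": [2, 5, 2, 1, 3, 2, 4, 1],
})
# Hypothetical: ColA contains ints and strings mixed together.
dfB = pd.DataFrame({"ColA": [123, 456, "CCC", 1011]})

# Normalise both sides to strings so "123" and 123 compare equal.
keys = dfB["ColA"].astype(str)
mask = dfA["IDCol2"].astype(str).isin(keys)

result = (dfA[mask].groupby("IDCol1")["IDCol3"]
                   .sum()
                   .reset_index(name="Sum"))
print(result)
```

If both columns already share a dtype, the astype(str) calls can be dropped and the one-liner above works as is.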