Pandas: sum of values in one dataframe based on the group in a different dataframe

I have a dataframe that contains companies with their sectors:

  Symbol             Sector
0    MCM             Industrials
1    AFT             Health Care
2    ABV             Health Care
3    AMN             Health Care
4    ACN  Information Technology

I have another dataframe that contains companies with their positions:

  Symbol  Position
0    ABC  1864817
1    AAP -3298989
2    ABV -1556626
3    AXC  2436387
4    ABT   878535 

What I want is a dataframe that contains the aggregate position per sector, i.e. the sum of the positions of all the companies in a given sector. I can select the companies of one sector at a time with:

df2[df2.Symbol.isin(df1.groupby('Sector').get_group('Industrials')['Symbol'].to_list())]  

I am looking for a more efficient pandas approach than looping over each sector of the groupby. The final dataframe should look like the following:

     Sector                  Sum Position
0    Industrials             14567232
1    Health Care            -329173249
2    Information Technology -65742234
3    Energy                  6574352342
4    Pharma                  6342387658

Any help is appreciated.

If I understood the question correctly, one way to do it is to join the two data frames, then group by sector and sum the Position column, like so:

# Match rows on Symbol; df1.join(df2['Position']) would align them by row index, not by company
df_agg = df1.merge(df2, on='Symbol').drop('Symbol', axis=1)
df_agg.groupby('Sector').sum()

Here, df1 is the dataframe with the sectors and df2 is the dataframe with the positions.
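To make the join-then-group idea concrete, here is a minimal sketch on made-up data. It assumes the two frames should be matched on the Symbol column; the sample symbols and position values are invented for illustration, and the output column name "Sum Position" is taken from the table in the question.

```python
import pandas as pd

# Hypothetical sample data (values invented; symbols chosen so some of them match)
df1 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "AMN", "ACN"],
    "Sector": ["Industrials", "Health Care", "Health Care",
               "Health Care", "Information Technology"],
})
df2 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "ACN", "ABT"],
    "Position": [100, -30, 50, 7, 999],
})

# The default inner merge keeps only symbols present in both frames,
# so ABT (no sector) and AMN (no position) are left out of the sums.
result = (
    df1.merge(df2, on="Symbol")
       .groupby("Sector", as_index=False)["Position"]
       .sum()
       .rename(columns={"Position": "Sum Position"})
)
print(result)
```

Using as_index=False keeps Sector as an ordinary column, which should match the layout asked for in the question.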

You can map the Symbol column to its sector and group by the resulting Series.

df2.groupby(df2.Symbol.map(df1.set_index('Symbol').Sector)).Position.sum()
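As a rough illustration of the mapping approach, the sketch below splits it into steps on invented data. It assumes each Symbol appears only once in df1, since the lookup Series needs a unique index; the names sector_of and keys are placeholders of mine, not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration)
df1 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "AMN", "ACN"],
    "Sector": ["Industrials", "Health Care", "Health Care",
               "Health Care", "Information Technology"],
})
df2 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "ACN", "ABT"],
    "Position": [100, -30, 50, 7, 999],
})

# Lookup Series: index is Symbol, values are Sector
sector_of = df1.set_index("Symbol")["Sector"]

# Translate each symbol in df2 to its sector; ABT is not in df1 and becomes NaN
keys = df2["Symbol"].map(sector_of).rename("Sector")

# Rows whose key is NaN are dropped by groupby by default
totals = df2.groupby(keys)["Position"].sum()
print(totals)
```

Compared with a merge, this avoids building an intermediate joined frame, at the cost of silently ignoring positions whose symbol has no sector.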

Let's use merge:

df2.merge(df1,how='left').groupby('Sector').Position.sum()
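With how='left', every row of df2 survives the merge, but symbols that have no match in df1 get a NaN Sector and then drop out of the groupby. If those positions should stay visible, one option is to label them before grouping. The sketch below assumes invented sample data and a placeholder label "Unknown" that is my own choice, not something from the question.

```python
import pandas as pd

# Hypothetical sample data (values invented for illustration)
df1 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "AMN", "ACN"],
    "Sector": ["Industrials", "Health Care", "Health Care",
               "Health Care", "Information Technology"],
})
df2 = pd.DataFrame({
    "Symbol": ["MCM", "AFT", "ABV", "ACN", "ABT"],
    "Position": [100, -30, 50, 7, 999],
})

# Left merge keeps all positions; ABT has no sector, so its Sector is NaN
merged = df2.merge(df1, on="Symbol", how="left")

# Give unmatched symbols an explicit bucket instead of losing them in groupby
merged["Sector"] = merged["Sector"].fillna("Unknown")

totals = merged.groupby("Sector")["Position"].sum()
print(totals)
```

Whether unmatched positions belong in the report is a judgment call; skipping the fillna step reproduces the behaviour of the one-liner above.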
