
Sum the values of a series in pandas based on one of multiple keys?

I'm working with pandas in Python, and I have a pandas Series object with a three-level index that I can't for the life of me figure out. It essentially looks like this:

>>> print(series_object)

key1              key2      key3                                                             
First class       19438     Error1:0       117
                  16431     Error2:0       80
                  1         Error3:0       70
Second class      28039     Error4:0       65
Third class       2063      Error5:0       28
                  19439     Error6:0       25
Fourth class      25975     Error7:0       11
Fifth class       23111     Error8:0       7
                  1243      Error9:665     4
                            Error9:581     3
                  27525     Error10:0      3
                  1243      Error9:748     2
                  1247      Error11:65     2
                  1243      Error9:852     2
                  1247      Error11:66     2
                            Error11:70     1
                            Error11:95     1
                            Error11:181    1
                            Error11:102    1
                            Error11:160    1

I want to sum the values of this object wherever key2 matches, so that series_object becomes:

>>> print(series_object)
key1              key2      key3                                                             
First class       19438     Error1:0       117
                  16431     Error2:0       80
                  1         Error3:0       70
Second class      28039     Error4:0       65
Third class       2063      Error5:0       28
                  19439     Error6:0       25
Fourth class      25975     Error7:0       11
Fifth class       23111     Error8:0       7
                  1243      Error9:665     11
                  27525     Error10:0      3
                  1247      Error11:65     9

I've tried a lot of different things. With a normal array this wouldn't be an issue for me, but the pandas Series object is new to me and I find it confusing. Could anyone provide some help?

You can use groupby on the index levels:

http://pandas.pydata.org/pandas-docs/stable/groupby.html#groupby-with-multiindex

In your case, to sum over everything that shares the same key2:

series_object.groupby(level='key2').sum()

Or, if you want to keep the 'key1' information as well:

series_object.groupby(level=['key1', 'key2']).sum()
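
To make this concrete, here is a minimal sketch that rebuilds a few rows of the Series from the question and applies both calls. The sample rows and the variable names by_key2 and by_key1_key2 are only for illustration, and sort=False is an optional extra that asks pandas not to sort the group keys. Note that the result is indexed only by the levels you group on, so key3 does not appear in the output.

```python
import pandas as pd

# Illustrative subset of the Series from the question (not the full data).
index = pd.MultiIndex.from_tuples(
    [
        ("Fourth class", 25975, "Error7:0"),
        ("Fifth class", 23111, "Error8:0"),
        ("Fifth class", 1243, "Error9:665"),
        ("Fifth class", 1243, "Error9:581"),
        ("Fifth class", 27525, "Error10:0"),
        ("Fifth class", 1243, "Error9:748"),
        ("Fifth class", 1243, "Error9:852"),
    ],
    names=["key1", "key2", "key3"],
)
series_object = pd.Series([11, 7, 4, 3, 3, 2, 2], index=index)

# Sum all values that share the same key2; the result is indexed by key2 only.
by_key2 = series_object.groupby(level="key2").sum()

# Keep key1 as well; the result is indexed by (key1, key2).
# sort=False is optional and requests that group keys are not sorted.
by_key1_key2 = series_object.groupby(level=["key1", "key2"], sort=False).sum()

print(by_key2)
print(by_key1_key2)
```

In this sample the four rows with key2 equal to 1243 (4, 3, 2 and 2) collapse into a single entry of 11, which matches the value asked for in the question.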

