
Python pandas: group by two columns, sum on one but not on the others

I have a dataframe, df:

Location  Category   Species       Number  x    y
a         Duiker     P monticola   3       9.1  -2.1
a         Duiker     C callipygus  6       9.1  -2.1
b         Duiker     C callipygus  4       9.2  -2.2
b         Carnivore  G servalina   2       9.2  -2.2
b         Carnivore  G maculata    3       9.2  -2.2
c         Carnivore  C aurata      1       9.3  -2.3

I want to convert df into the following format with these conditions:

  1. remove the Species column
  2. group by Category AND Location
  3. when grouping, sum the Number values
  4. do not aggregate columns x and y (they are the same for any given Location)

So that df becomes:

Location  Category   Number  x    y
a         Duiker     9       9.1  -2.1
b         Duiker     4       9.2  -2.2
b         Carnivore  5       9.2  -2.2
c         Carnivore  1       9.3  -2.3

My current method would be to:

  1. split df into df1 and df2, where
df1 = df[['Location', 'Category', 'Number']].copy()
df2 = df[['Location', 'x', 'y']].copy()
  2. group and sum df1 by Location and Category
df1 = df1.groupby(['Location', 'Category']).sum()
  3. merge (inner join) df1 and df2 on Location and Category
df3 = pd.merge(df1, df2, how='inner', on=['Location'])

Instead, at step 3 I get a df3 where the Category column is removed and the Locations are no longer grouped, like this:

Location  Number  x    y
a         9       9.1  -2.1
a         9       9.1  -2.1
b         4       9.2  -2.2
b         4       9.2  -2.2
c         1       9.3  -2.3

I'm a bit lazy and a bit stuck. Can someone throw me a bone, and perhaps make my code more efficient in the process? Thank you in advance.

Specify the columns and functions in agg:

  • sum the "Number" column
  • keep the first values in columns "x" and "y"
>>> df.groupby(["Location", "Category"], as_index=False).agg({"Number": "sum", "x": "first", "y": "first"})

  Location   Category  Number    x    y
0        a     Duiker       9  9.1 -2.1
1        b  Carnivore       5  9.2 -2.2
2        b     Duiker       4  9.2 -2.2
3        c  Carnivore       1  9.3 -2.3
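
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the question's dataframe by hand and applies the same agg call. Two things in it are my additions rather than part of the answer above: the nunique check, which only confirms the stated assumption that x and y are constant within each Location (otherwise "first" would silently discard values), and sort=False, which keeps groups in order of first appearance so the rows come out in the order the question asked for.

```python
import pandas as pd

# Rebuild the dataframe from the question so the example is self-contained.
df = pd.DataFrame({
    "Location": ["a", "a", "b", "b", "b", "c"],
    "Category": ["Duiker", "Duiker", "Duiker", "Carnivore", "Carnivore", "Carnivore"],
    "Species": ["P monticola", "C callipygus", "C callipygus",
                "G servalina", "G maculata", "C aurata"],
    "Number": [3, 6, 4, 2, 3, 1],
    "x": [9.1, 9.1, 9.2, 9.2, 9.2, 9.3],
    "y": [-2.1, -2.1, -2.2, -2.2, -2.2, -2.3],
})

# Assumption check (not in the original answer): x and y must take a single
# value per Location, otherwise taking "first" would hide real differences.
assert (df.groupby("Location")[["x", "y"]].nunique() == 1).all().all()

# Sum Number and keep the first x and y per (Location, Category) group.
# Species is dropped simply because it is not listed in the agg dict.
# sort=False keeps groups in order of first appearance instead of sorting keys.
result = (
    df.groupby(["Location", "Category"], as_index=False, sort=False)
      .agg({"Number": "sum", "x": "first", "y": "first"})
)
print(result)
```

Without sort=False the result is the same set of rows, just sorted by the group keys as in the output shown above.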
