I have a dataframe, df:
Location | Category | Species | Number | x | y |
---|---|---|---|---|---|
a | Duiker | P monticola | 3 | 9.1 | -2.1 |
a | Duiker | C callipygus | 6 | 9.1 | -2.1 |
b | Duiker | C callipygus | 4 | 9.2 | -2.2 |
b | Carnivore | G servalina | 2 | 9.2 | -2.2 |
b | Carnivore | G maculata | 3 | 9.2 | -2.2 |
c | Carnivore | C aurata | 1 | 9.3 | -2.3 |
I want to convert df into the following format, with one row per Location and Category, Number summed within each group, and x and y kept as they are.
So that df becomes:
Location | Category | Number | x | y |
---|---|---|---|---|
a | Duiker | 9 | 9.1 | -2.1 |
b | Duiker | 4 | 9.2 | -2.2 |
b | Carnivore | 5 | 9.2 | -2.2 |
c | Carnivore | 1 | 9.3 | -2.3 |
My current method would be to:
df1 = df[['Location', 'Category', 'Number']].copy()
df2 = df[['Location', 'x', 'y']].copy()
df1 = df1.groupby(['Location', 'Category']).sum()
df3 = pd.merge(df1, df2, how = 'inner', on = ['Location'])
Instead, at step 3 I get a df3 where the Category column is removed and the Locations are no longer grouped, like this:
Location | Number | x | y |
---|---|---|---|
a | 9 | 9.1 | -2.1 |
a | 9 | 9.1 | -2.1 |
b | 4 | 9.2 | -2.2 |
b | 4 | 9.2 | -2.2 |
c | 1 | 9.3 | -2.3 |
I'm a bit lazy and a bit stuck. Can someone throw me a bone, and perhaps make my code more efficient in the process? Thank you in advance.
Specify the columns and functions in agg: sum the "Number" column and take the first values in columns "x" and "y":
>>> df.groupby(["Location","Category"],as_index=False).agg({"Number":"sum","x":"first","y":"first"})
Location Category Number x y
0 a Duiker 9 9.1 -2.1
1 b Carnivore 5 9.2 -2.2
2 b Duiker 4 9.2 -2.2
3 c Carnivore 1 9.3 -2.3
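For anyone who wants to run this end to end, here is a minimal self-contained sketch that rebuilds the example df from the question and applies the same idea. Two things here are my own additions rather than part of the answer above: it uses pandas named aggregation instead of the dict form, and it passes sort=False so the groups come out in order of first appearance, which happens to match the row order in the question's desired output. It also assumes x and y are constant within each Location, as they are in the sample data.

```python
import pandas as pd

# Rebuild the example dataframe from the question
df = pd.DataFrame({
    "Location": ["a", "a", "b", "b", "b", "c"],
    "Category": ["Duiker", "Duiker", "Duiker", "Carnivore", "Carnivore", "Carnivore"],
    "Species": ["P monticola", "C callipygus", "C callipygus",
                "G servalina", "G maculata", "C aurata"],
    "Number": [3, 6, 4, 2, 3, 1],
    "x": [9.1, 9.1, 9.2, 9.2, 9.2, 9.3],
    "y": [-2.1, -2.1, -2.2, -2.2, -2.2, -2.3],
})

# Group by Location and Category; sum Number, keep the first x and y per group.
# sort=False keeps groups in order of first appearance instead of sorting the keys.
result = (
    df.groupby(["Location", "Category"], as_index=False, sort=False)
      .agg(Number=("Number", "sum"), x=("x", "first"), y=("y", "first"))
)

print(result)
```

If x and y could differ within a group, "first" would silently pick one of the values, so it is worth checking that assumption on real data before relying on it.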