I have the following dataframe df:
  topic  num
0   a01    1
1   a01    1
2   a01    2
3   a02    1
4   a02    3
5   a02    2
6   a02    3
7   a03    2
8   a03    1
And I need to create a new dataframe newdf, where each row holds a topic and the maximum number for that topic, like the following:
  topic  num
0   a01    2
1   a02    3
2   a03    2
I've tried to use the max() function from pandas, but to no avail. What I don't understand is how to go through the rows and find the highest value corresponding to each topic. How do I separate a01 from a02, so that I can get the maximum value for each? I've also tried transposing, but I run into the same problem.
See "Get the row(s) which have the max value in groups using groupby".
Example:
new_df = df.groupby(['topic'], sort=False)['num'].max()
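Note that selecting ['num'] before calling max() gives a Series indexed by topic, not a dataframe with topic as a regular column. A minimal sketch, assuming the example data from the question (the name max_per_topic is only illustrative), of turning that result into the requested layout with reset_index():

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "topic": ["a01", "a01", "a01", "a02", "a02", "a02", "a02", "a03", "a03"],
    "num": [1, 1, 2, 1, 3, 2, 3, 2, 1],
})

# Group by topic and take the maximum of num; the result is a Series
# whose index holds the topic values.
max_per_topic = df.groupby(["topic"], sort=False)["num"].max()

# Move the topic index back into a column to get a two-column dataframe.
new_df = max_per_topic.reset_index()
print(new_df)
```

With sort=False the groups keep the order in which each topic first appears in df.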
You can use GroupBy.max with numeric_only=True:
newdf = df.groupby("topic", as_index=False).max(numeric_only=True)
print(newdf)
  topic  num
0   a01    2
1   a02    3
2   a03    2
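As a rough illustration of what numeric_only=True changes, here is a small sketch that adds a hypothetical non-numeric column (label, which is not in the question's data) to the same dataframe. With numeric_only=True the aggregation is restricted to numeric columns, so label is left out of the result, while as_index=False keeps topic as an ordinary column:

```python
import pandas as pd

# Data from the question plus a made-up text column, "label",
# added only to show the effect of numeric_only=True.
df = pd.DataFrame({
    "topic": ["a01", "a01", "a01", "a02", "a02", "a02", "a02", "a03", "a03"],
    "num": [1, 1, 2, 1, 3, 2, 3, 2, 1],
    "label": list("abcdefghi"),  # hypothetical, not in the original data
})

# Only numeric columns are aggregated; as_index=False keeps the
# grouping key "topic" as a regular column instead of the index.
newdf = df.groupby("topic", as_index=False).max(numeric_only=True)
print(newdf)
```

For the two-column dataframe in the question the flag makes no visible difference, since num is the only non-key column.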