
Create new dataframe from the highest values in a column

I have the following dataframe df:

    topic   num
0   a01     1
1   a01     1
2   a01     2
3   a02     1
4   a02     3
5   a02     2
6   a02     3
7   a03     2
8   a03     1

I need to create a new dataframe newdf, where each row holds a topic and the maximum num for that topic, like the following:

    topic   num
0   a01     2
1   a02     3
2   a03     2

I've tried to use the max() function from pandas, but to no avail. What I don't understand is how to go through the rows and find the highest value for each topic. How do I separate a01 from a02, so that I can get the maximum value for each? I've also tried transposing, but I run into the same problem.

See the related question "Get the row(s) which have the max value in groups using groupby".

Example:

new_df = df.groupby(['topic'], sort=False)['num'].max().reset_index()
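
Selecting ['num'] before calling max() produces a Series indexed by topic, and reset_index() turns that back into a two-column dataframe. A minimal sketch of both steps, assuming the df from the question (rebuilt here so the snippet runs on its own; the name max_per_topic is only illustrative):

```python
import pandas as pd

# Rebuild the dataframe from the question so the snippet is self-contained.
df = pd.DataFrame({
    "topic": ["a01", "a01", "a01", "a02", "a02", "a02", "a02", "a03", "a03"],
    "num": [1, 1, 2, 1, 3, 2, 3, 2, 1],
})

# Group by topic, pick the num column, and take the maximum per group.
# The result is a Series whose index holds the topic labels.
max_per_topic = df.groupby("topic", sort=False)["num"].max()

# reset_index() moves the topic labels back into an ordinary column,
# which gives a dataframe with a fresh 0..n-1 index.
new_df = max_per_topic.reset_index()
print(new_df)
```

sort=False keeps the topics in the order they first appear in df instead of sorting them, which makes no visible difference for this data because the topics are already in order.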

You can use GroupBy.max with numeric_only=True:

newdf = df.groupby("topic", as_index=False).max(numeric_only=True)

Output:

print(newdf)

  topic  num
0   a01    2
1   a02    3
2   a03    2
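
With only topic and num in the dataframe, numeric_only=True changes nothing; it starts to matter once other non-numeric columns are present. A rough sketch of that case, assuming the question's data plus a hypothetical text column called note that is not part of the original df:

```python
import pandas as pd

# The question's data plus a hypothetical text column "note" (an assumption,
# added only to show what numeric_only=True does).
df = pd.DataFrame({
    "topic": ["a01", "a01", "a01", "a02", "a02", "a02", "a02", "a03", "a03"],
    "num": [1, 1, 2, 1, 3, 2, 3, 2, 1],
    "note": ["x", "y", "z", "x", "y", "z", "x", "y", "z"],
})

# as_index=False keeps topic as a regular column instead of the index.
# numeric_only=True restricts the aggregation to numeric columns,
# so "note" is left out of the result.
newdf = df.groupby("topic", as_index=False).max(numeric_only=True)
print(newdf)
```

Without numeric_only=True, max() would also be applied to note and return the largest string per topic, which is usually not what is wanted.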
