Sorting and Filtering Pandas Dataframe

I'm trying to filter and sort a Pandas dataframe to clean my data. I've looked on StackOverflow and can't seem to find a method that will give me the sort and filter I need. The data I'm working with looks something like this:

| Name 1 | Name 2 | Score |
| ------ | ------ | ----- | 
| Amy | Jack | 2.456 | 
| Amy | Jack | 3.234 | 
| Amy | Jack | 5.124 | 
| ... | ... | ... | 
| Max | Jane | 8.569 |
| Max | Jane | 4.654 |
| Max | Jane | 6.349 |

What I want to do is make a new dataframe containing only the lowest score for every pair of names. The resulting dataframe would look something like this:

| Name 1 | Name 2 | Score |
| ------ | ------ | ----- | 
| Amy | Jack | 2.456 | 
| ... | ... | ...|
| Max | Jane | 4.654 | 

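Group by both name columns and take the minimum Score with a named aggregation (available since pandas 0.25):

df = df.groupby(['Name 1', 'Name 2'], as_index=False).agg(Score=('Score', 'min'))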

Output:

>>> df
  Name1 Name2  Score
0   Amy  Jack  2.456
1   Max  Jane  4.654
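
To try this end to end, here is a minimal sketch that rebuilds only the six rows visible in the question (the elided "..." rows are left out) and applies the same named aggregation. The variable name `lowest` is just an illustrative choice.

```python
import pandas as pd

# Only the six rows shown in the question; the elided "..." rows are omitted.
df = pd.DataFrame({
    "Name 1": ["Amy", "Amy", "Amy", "Max", "Max", "Max"],
    "Name 2": ["Jack", "Jack", "Jack", "Jane", "Jane", "Jane"],
    "Score": [2.456, 3.234, 5.124, 8.569, 4.654, 6.349],
})

# Named aggregation: output column "Score" = min of input column "Score",
# computed once per ("Name 1", "Name 2") pair.
lowest = df.groupby(["Name 1", "Name 2"], as_index=False).agg(Score=("Score", "min"))

# Amy/Jack keeps 2.456 and Max/Jane keeps 4.654.
print(lowest)
```

Note that this returns only the grouping columns and the aggregated column; any other columns in the original dataframe are dropped unless they are aggregated too.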

Alternatively, sort by Score first and then take the first row of each group, using sort_values() and groupby():

df.sort_values(by='Score').groupby(['Name 1', 'Name 2'], as_index=False).first()
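
One practical difference of this variant is that the remaining columns of the lowest-scoring row come along with it. The sketch below illustrates that with a hypothetical extra column, `Round`, which is not part of the original data. Keep in mind that `first()` returns the first non-null value per column, so with missing values the result may mix values from different rows.

```python
import pandas as pd

df = pd.DataFrame({
    "Name 1": ["Amy", "Amy", "Amy", "Max", "Max", "Max"],
    "Name 2": ["Jack", "Jack", "Jack", "Jane", "Jane", "Jane"],
    "Score": [2.456, 3.234, 5.124, 8.569, 4.654, 6.349],
    "Round": [1, 2, 3, 1, 2, 3],  # hypothetical extra column, for illustration only
})

# Sort ascending so the lowest score comes first within each pair,
# then keep the first row of every ("Name 1", "Name 2") group.
best = (
    df.sort_values(by="Score")
      .groupby(["Name 1", "Name 2"], as_index=False)
      .first()
)

# "Round" holds the round in which each pair's lowest score occurred.
print(best)
```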

Or sort by Score and keep only the first row for each pair of names, using sort_values() and drop_duplicates():

df.sort_values(by='Score').drop_duplicates(subset=['Name 1', 'Name 2'])
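
A short sketch of the same idea on the six visible rows from the question follows. `drop_duplicates()` keeps the first occurrence by default and preserves the original index labels, so the optional `reset_index(drop=True)` step here is an addition for tidiness rather than part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    "Name 1": ["Amy", "Amy", "Amy", "Max", "Max", "Max"],
    "Name 2": ["Jack", "Jack", "Jack", "Jane", "Jane", "Jane"],
    "Score": [2.456, 3.234, 5.124, 8.569, 4.654, 6.349],
})

# After sorting ascending, the first occurrence of each pair is its lowest score.
deduped = df.sort_values(by="Score").drop_duplicates(subset=["Name 1", "Name 2"])

# The surviving rows keep their original index labels (0 and 4 here);
# renumber them from zero if a clean index is preferred.
deduped_clean = deduped.reset_index(drop=True)

print(deduped_clean)
```

Unlike the groupby-based versions, the result is ordered by Score rather than by the name columns.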
