Groupby and descendingly rank one column based on another one in Pandas

For the following example data frame, I'm trying to group by class and rank the score within each group.

    stu_id class    name  score
0        1     A    Jack     45
1        2     A   Oscar     75
2        3     B   Emile     60
3        4     B  Sophie     64
4        5     B     Jim     85
5        6     A  Thomas     55
6        7     A   David     60
7        8     B     Lee     60
8        9     B   Elvis     70
9       10     A   Frank     75
10      11     A   James     90

I have tried:

df['rank'] = df.groupby(['class'])['score'].rank(ascending=True)
df

Result:

    stu_id class    name  score  rank
0        1     A    Jack     45   1.0
1        2     A   Oscar     75   4.5
2        3     B   Emile     60   1.5
3        4     B  Sophie     64   3.0
4        5     B     Jim     85   5.0
5        6     A  Thomas     55   2.0
6        7     A   David     60   3.0
7        8     B     Lee     60   1.5
8        9     B   Elvis     70   4.0
9       10     A   Frank     75   4.5
10      11     A   James     90   6.0

But my expected output should look like this. Why doesn't my code work? Thanks.

    stu_id class    name  score  rank
0        1     A    Jack     45     1
1        2     A   Oscar     75     4
2        3     B   Emile     60     1
3        4     B  Sophie     64     2
4        5     B     Jim     85     4
5        6     A  Thomas     55     2
6        7     A   David     60     3
7        8     B     Lee     60     1
8        9     B   Elvis     70     3
9       10     A   Frank     75     4
10      11     A   James     90     5

Use method='dense'.

The default ranking uses 'average' to resolve ties. In group A, Oscar and Frank share the same score, and together they occupy ranks 4 and 5. Under 'average' logic, both are set to 4.5, that is (4+5)/2, and the next value is ranked 6 as long as it is not tied with anything, which is the case for James. With 'dense', the tied values are given the lower rank (4 in this case), and the next distinct value continues the ranking at 5.
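To see the difference between the tie-breaking methods side by side, here is a small sketch that ranks only the group A scores from the question. The `scores` series and the `comparison` frame are names made up for this illustration; 'min' is included only as an extra point of comparison.

```python
import pandas as pd

# Scores for class A only, copied from the example data frame.
scores = pd.Series(
    [45, 75, 55, 60, 75, 90],
    index=["Jack", "Oscar", "Thomas", "David", "Frank", "James"],
)

# Rank the same values with three different tie-breaking methods.
comparison = pd.DataFrame({
    "score": scores,
    "average": scores.rank(method="average"),  # pandas default: ties share the mean rank
    "min": scores.rank(method="min"),          # ties share the lowest rank, then a gap follows
    "dense": scores.rank(method="dense"),      # ties share the lowest rank, no gap afterwards
})
print(comparison)
```

Oscar and Frank get 4.5 under 'average' and 4 under both 'min' and 'dense'. James gets 6 under 'average' and 'min', but 5 under 'dense', which is what the expected output asks for.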

df['rank'] = df.groupby(['class'])['score'].rank(method='dense').astype(int)

    stu_id class    name  score  rank
0        1     A    Jack     45     1
1        2     A   Oscar     75     4
2        3     B   Emile     60     1
3        4     B  Sophie     64     2
4        5     B     Jim     85     4
5        6     A  Thomas     55     2
6        7     A   David     60     3
7        8     B     Lee     60     1
8        9     B   Elvis     70     3
9       10     A   Frank     75     4
10      11     A   James     90     5
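The expected output above ranks scores in ascending order (the lowest score gets rank 1). If a genuinely descending rank is wanted, as the title suggests, one possible variation is to pass `ascending=False` together with `method='dense'`. The sketch below rebuilds the example frame. The `rank_desc` column name is an assumption chosen for this illustration.

```python
import pandas as pd

# Rebuild the example data frame from the question.
df = pd.DataFrame({
    "stu_id": range(1, 12),
    "class": list("AABBBAABBAA"),
    "name": ["Jack", "Oscar", "Emile", "Sophie", "Jim", "Thomas",
             "David", "Lee", "Elvis", "Frank", "James"],
    "score": [45, 75, 60, 64, 85, 55, 60, 60, 70, 75, 90],
})

# Ascending dense rank, as in the answer: lowest score in each class gets 1.
df["rank"] = df.groupby("class")["score"].rank(method="dense").astype(int)

# Descending dense rank: highest score in each class gets 1.
df["rank_desc"] = (
    df.groupby("class")["score"]
      .rank(method="dense", ascending=False)
      .astype(int)
)
print(df)
```

With this variation James (90) gets rank 1 in class A and Jim (85) gets rank 1 in class B, while tied scores still share a rank with no gaps.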
