Order Pandas dataframe groups by minimum index number, then re-order all other columns within groups based on a 3rd column
I have a dataframe like this:
import numpy as np
import pandas as pd
columns=['Order', 'Group_code', 'Grade', 'Contextual_info']
data = np.array([np.arange(6)]*4).T
mydf = pd.DataFrame(data, columns=columns)
mydf.Order = [1,2,3,4,5,6]
mydf.Group_code = ['group99','group2','group2','group2','group12','group12']
mydf.Grade = [80,0,60,80,85,70]
mydf.Contextual_info = [5,4,3,2,1,0]
mydf
   Order Group_code  Grade  Contextual_info
0      1    group99     80                5
1      2     group2      0                4
2      3     group2     60                3
3      4     group2     80                2
4      5    group12     85                1
5      6    group12     70                0
It is ordered by Order. I want to preserve the ordering of Group_code by Order, so the column values in Group_code should not change.
However, within each Group_code group, I want to order the rows descending by Grade. Finally, I will replace Order with a new vector of integers 1...n, such that it is still 1, 2, 3, 4, 5, 6 in this example.
Desired result:
Order Group_code  Grade  Contextual_info
    1    group99     80                5
    2     group2     80                2
    3     group2     60                3
    4     group2      0                4
    5    group12     85                1
    6    group12     70                0
Use
mydf.Grade = (mydf.groupby('Group_code')['Grade']
                  .transform(pd.Series.sort_values, ascending=False))
mydf
   Order Group_code  Grade
0      1    group99     80
1      2     group2     80
2      3     group2     60
3      4     group2      0
4      5    group12     85
5      6    group12     70
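Note that the snippet above reassigns only the Grade column, so Contextual_info stays where it was, while the desired result has each Contextual_info value travelling with its Grade. One possible alternative, offered as a sketch rather than the original answerer's method, is to sort whole rows: rank each group by its smallest Order value, sort by that rank and then by Grade descending, and renumber Order afterwards. The helper column name `_group_rank` and the variable `result` are made up for illustration.

```python
import pandas as pd

mydf = pd.DataFrame({
    'Order': [1, 2, 3, 4, 5, 6],
    'Group_code': ['group99', 'group2', 'group2', 'group2', 'group12', 'group12'],
    'Grade': [80, 0, 60, 80, 85, 70],
    'Contextual_info': [5, 4, 3, 2, 1, 0],
})

# Smallest Order value in each group, broadcast back to every row of that group.
# This keeps the groups in their original relative order.
group_rank = mydf.groupby('Group_code')['Order'].transform('min')

result = (
    mydf.assign(_group_rank=group_rank)  # hypothetical helper column
        .sort_values(['_group_rank', 'Grade'], ascending=[True, False])
        .drop(columns='_group_rank')
        .reset_index(drop=True)
)

# Replace Order with a fresh 1..n sequence.
result['Order'] = range(1, len(result) + 1)

print(result)
```

Because entire rows are moved, every other column follows Grade within its group. This assumes each Group_code occupies one contiguous block when ranked by its minimum Order, as it does in the example.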