Pandas: rank column A, ordered by column B

Question

Currently I have the following Python code:

import pandas as pd

forumposts = pd.DataFrame({'UserId': [1,1,2,3,2,1,3], 'FirstPostDate': [2018,2018,2017,2019,2017,2018,2019], 'PostDate': [201801,201802,201701,201901,201801,201803,201902]})

data = forumposts.groupby(['UserId', 'PostDate','FirstPostDate']).size().reset_index()

rankedUserIdByFirstPostDate = data.groupby(['UserId', 'FirstPostDate']).size().reset_index().sort_values('FirstPostDate').reset_index(drop=True).reset_index()

data.loc[:,'Rank'] = data.merge(rankedUserIdByFirstPostDate , how='left', on='UserId')['index'].values

The code works as intended, but it is complicated. Is there a more pandas-like way of doing this? The intent is the following:

Create a dense rank over the UserId column, ordered by FirstPostDate, so that the user with the earliest first post gets rank 0, the user with the second-earliest first post gets rank 1, and so on.

Using forumposts.UserId.rank(method='dense') gives me a ranking, but it is ordered by the UserId values themselves rather than by FirstPostDate.
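To make the mismatch concrete, here is a small sketch on the sample data. The names rank_by_id and wanted are only illustrative, and the wanted ranks are written out by hand from the intent described above:

```python
import pandas as pd

forumposts = pd.DataFrame({
    'UserId': [1, 1, 2, 3, 2, 1, 3],
    'FirstPostDate': [2018, 2018, 2017, 2019, 2017, 2018, 2019],
    'PostDate': [201801, 201802, 201701, 201901, 201801, 201803, 201902],
})

# Dense rank of the ids themselves: user 1 -> 0, user 2 -> 1, user 3 -> 2.
rank_by_id = forumposts['UserId'].rank(method='dense').sub(1).astype(int)

# Desired ranks, hand-written for illustration: user 2 posted first (2017),
# then user 1 (2018), then user 3 (2019).
wanted = forumposts['UserId'].map({2: 0, 1: 1, 3: 2})

print(rank_by_id.tolist())  # [0, 0, 1, 2, 1, 0, 2]
print(wanted.tolist())      # [1, 1, 0, 2, 0, 1, 2]
```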

Answer 1

Use map with a dictionary: sort_values followed by drop_duplicates gives the users in first-post order, and zipping them with np.arange assigns each user its rank:

import numpy as np

data = (forumposts.groupby(['UserId', 'PostDate','FirstPostDate'])
                  .size()
                  .reset_index(name='count'))

users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']
d = dict(zip(users, np.arange(len(users))))
data['Rank'] = data['UserId'].map(d)
print(data)
   UserId  PostDate  FirstPostDate  count  Rank
0       1    201801           2018      1     1
1       1    201802           2018      1     1
2       1    201803           2018      1     1
3       2    201701           2017      1     0
4       2    201801           2017      1     0
5       3    201901           2019      1     2
6       3    201902           2019      1     2
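To see what the mapping step produces, the sketch below rebuilds the intermediate values on the same sample data. It uses enumerate instead of np.arange purely so the dictionary prints with plain Python integers; this is an illustrative variation, not a change to the approach above.

```python
import pandas as pd

forumposts = pd.DataFrame({
    'UserId': [1, 1, 2, 3, 2, 1, 3],
    'FirstPostDate': [2018, 2018, 2017, 2019, 2017, 2018, 2019],
    'PostDate': [201801, 201802, 201701, 201901, 201801, 201803, 201902],
})

data = (forumposts.groupby(['UserId', 'PostDate', 'FirstPostDate'])
                  .size()
                  .reset_index(name='count'))

# One row per user, in order of first post date.
users = data.sort_values('FirstPostDate').drop_duplicates('UserId')['UserId']

# Position in that ordering becomes the rank (same idea as zip + np.arange).
d = {user: rank for rank, user in enumerate(users)}

print(users.tolist())  # [2, 1, 3]
print(d)               # {2: 0, 1: 1, 3: 2}
```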

Another solution:

data['Rank'] = (data.groupby('UserId')['FirstPostDate']
                   .transform('min')
                   .rank(method='dense')
                   .sub(1)
                   .astype(int))
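The two solutions agree on the sample data, but they can differ when several users share the same FirstPostDate. The sketch below uses a made-up frame (toy, with hypothetical values) to show this: the dictionary approach gives tied users distinct consecutive ranks, while the dense rank gives them the same rank. kind='stable' is passed here so that the order among tied users is deterministic.

```python
import pandas as pd

# Hypothetical data: users 1 and 3 both first posted in 2018.
toy = pd.DataFrame({'UserId': [1, 2, 3, 1],
                    'FirstPostDate': [2018, 2017, 2018, 2018]})

# Dictionary approach: position in the de-duplicated, sorted user list.
users = (toy.sort_values('FirstPostDate', kind='stable')
            .drop_duplicates('UserId')['UserId'])
rank_by_map = toy['UserId'].map({u: i for i, u in enumerate(users)})

# Dense-rank approach: users with equal first post dates share a rank.
rank_by_dense = (toy.groupby('UserId')['FirstPostDate']
                    .transform('min')
                    .rank(method='dense')
                    .sub(1)
                    .astype(int))

print(rank_by_map.tolist())    # [1, 0, 2, 1]
print(rank_by_dense.tolist())  # [1, 0, 1, 1]
```

Which behaviour is preferable depends on whether users who tie on FirstPostDate should share a rank.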

Answer 1: accepted, 2018-11-22 08:21:10