
How to use shift in pandas based on ranking column

My data looks like this.

I want to get the previous DateTime based on rank. When I use the pandas shift(1) function, I get the previous DateTime as '2019/10/15 00:00:00' instead of '2019/10/11 08:31:00' on the 9th row, and the same thing happens in the other rank groups. When the rank is the same, I want the same previous time. The required results are below.

+------+---------------------+-----------------------+------+
| Rank |      DateTime       | Elapsed Time Previous | Name |
+------+---------------------+-----------------------+------+
|    1 | 2019/09/23 08:45:00 |                       |      |
|    2 | 2019/09/27 10:14:00 | 2019/09/23 08:45:00   |      |
|    3 | 2019/10/01 09:28:00 | 2019/09/27 10:14:00   |      |
|    4 | 2019/10/04 14:25:00 | 2019/10/01 09:28:00   |      |
|    5 | 2019/10/08 10:46:00 | 2019/10/04 14:25:00   |      |
|    6 | 2019/10/11 08:25:00 | 2019/10/08 10:46:00   |      |
|    7 | 2019/10/11 08:31:00 | 2019/10/11 08:25:00   |      |
|    8 | 2019/10/15 00:00:00 | 2019/10/11 08:31:00   |      |
|    8 | 2019/10/15 00:00:00 | 2019/10/11 08:31:00   |      |
|    1 | 2019/09/06 00:00:00 |                       |      |
|    2 | 2019/09/10 00:00:00 | 2019/09/06 00:00:00   |      |
|    3 | 2019/09/13 00:00:00 | 2019/09/10 00:00:00   |      |
|    4 | 2019/09/17 00:00:00 | 2019/09/13 00:00:00   |      |
|    5 | 2019/09/20 10:00:00 | 2019/09/17 00:00:00   |      |
|    6 | 2019/09/24 00:00:00 | 2019/09/20 10:00:00   |      |
|    7 | 2019/09/27 10:53:00 | 2019/09/24 00:00:00   |      |
|    8 | 2019/10/01 10:21:00 | 2019/09/27 10:53:00   |      |
|    9 | 2019/10/04 09:59:00 | 2019/10/01 10:21:00   |      |
|   10 | 2019/10/08 09:58:00 | 2019/10/04 09:59:00   |      |
|   11 | 2019/10/11 10:41:00 | 2019/10/08 09:58:00   |      |
|    1 | 2019/09/23 09:00:00 |                       |      |
|    2 | 2019/09/27 11:03:00 | 2019/09/23 09:00:00   |      |
|    3 | 2019/10/01 10:14:00 | 2019/09/27 11:03:00   |      |
|    4 | 2019/10/04 09:46:00 | 2019/10/01 10:14:00   |      |
|    5 | 2019/10/08 10:04:00 | 2019/10/04 09:46:00   |      |
|    6 | 2019/10/11 10:33:00 | 2019/10/08 10:04:00   |      |
|    7 | 2019/10/15 00:00:00 | 2019/10/11 10:33:00   |      |
|    7 | 2019/10/15 00:00:00 | 2019/10/11 10:33:00   |      |
+------+---------------------+-----------------------+------+
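To make the problem concrete, here is a minimal sketch that rebuilds the last four rows of the first group from the table above (ranks 6, 7, 8, 8) and applies a plain row-wise shift. The column name `Previous Datetime` and the variable names are chosen for illustration only.

```python
import pandas as pd

# Last four rows of the first group in the table above.
df = pd.DataFrame({
    "Rank": [6, 7, 8, 8],
    "DateTime": pd.to_datetime([
        "2019/10/11 08:25:00",
        "2019/10/11 08:31:00",
        "2019/10/15 00:00:00",
        "2019/10/15 00:00:00",
    ]),
})

# Plain shift works by row position and knows nothing about Rank:
# every row simply receives the DateTime of the row directly above it.
df["Previous Datetime"] = df["DateTime"].shift(1)

first_tied = df["Previous Datetime"].iloc[2]   # first row with Rank 8
second_tied = df["Previous Datetime"].iloc[3]  # second row with Rank 8
print(df)
```

The first Rank 8 row gets the Rank 7 time as wanted, but the second Rank 8 row picks up the time of its tied neighbour, which is the behaviour described in the question.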

Use DataFrame.drop_duplicates, then Series.shift on the DateTime Series after setting Rank as the index; the result can then be mapped back onto the Rank column with Series.map:

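import pandas as pd

# Parse the strings so the shifted values are real timestamps.
df['DateTime'] = pd.to_datetime(df['DateTime'])

# One row per Rank, indexed by Rank; shift() moves each DateTime down one rank.
s = df.drop_duplicates('Rank').set_index('Rank')['DateTime'].shift()

# Look up the previous DateTime for every row, so tied ranks share one value.
df['Previous Datetime'] = df['Rank'].map(s)
print(df)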
             DateTime   Previous Datetime  Rank
0 2019-09-06 00:00:00                 NaT     1
1 2019-09-10 00:00:00 2019-09-06 00:00:00     2
2 2019-09-13 00:00:00 2019-09-10 00:00:00     3
3 2019-09-17 00:00:00 2019-09-13 00:00:00     4
4 2019-09-20 10:00:00 2019-09-17 00:00:00     5
5 2019-09-24 00:00:00 2019-09-20 10:00:00     6
6 2019-09-27 10:21:00 2019-09-24 00:00:00     7
7 2019-10-01 00:00:00 2019-09-27 10:21:00     8
8 2019-10-01 00:00:00 2019-09-27 10:21:00     8
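The intermediate Series `s` is what makes this work, so it may help to look at it on its own. The sketch below is self-contained and uses the last four rows of the output above (ranks 6, 7, 8, 8); the variable names are illustrative. Note that `drop_duplicates('Rank')` keeps the first occurrence of each Rank across the whole frame, so this form fits a single rank sequence like the one printed above.

```python
import pandas as pd

df = pd.DataFrame({
    "DateTime": pd.to_datetime([
        "2019-09-24 00:00:00",
        "2019-09-27 10:21:00",
        "2019-10-01 00:00:00",
        "2019-10-01 00:00:00",
    ]),
    "Rank": [6, 7, 8, 8],
})

# Lookup table: index is the unique Rank, value is the DateTime of the
# previous rank (NaT for the first rank, which has no predecessor).
s = df.drop_duplicates("Rank").set_index("Rank")["DateTime"].shift()
print(s)

# Map every row's Rank through the lookup, so both Rank 8 rows agree.
df["Previous Datetime"] = df["Rank"].map(s)
print(df)
```

Because the lookup is keyed by Rank rather than by row position, tied rows can never see each other's timestamps.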

EDIT:

df = df.drop('Elapsed Time Previous', axis=1)

df['DateTime'] = pd.to_datetime(df['DateTime'])

# Previous DateTime for each (Name, ID, Rank), shifted one rank within each (Name, ID) group
s = (df.drop_duplicates(['Rank','Name', 'ID'])
       .set_index(['Name', 'ID', 'Rank'])['DateTime']
       .unstack()
       .shift(axis=1)
       .stack()
       .rename('Elapsed Time Previous'))

df = df.join(s, on=['Name','ID','Rank'])
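As a rough end-to-end check of the grouped version, the sketch below runs the same chain on a tiny frame. The `Name` and `ID` values ("A"/1 and "B"/2) are made up for illustration, since the question does not show them; the timestamps are taken from the table in the question.

```python
import pandas as pd

# Two hypothetical groups; ranks restart at 1 in each, with a tie at Rank 3.
df = pd.DataFrame({
    "Name": ["A", "A", "A", "A", "B", "B"],
    "ID": [1, 1, 1, 1, 2, 2],
    "Rank": [1, 2, 3, 3, 1, 2],
    "DateTime": pd.to_datetime([
        "2019/10/08 10:46:00",
        "2019/10/11 08:25:00",
        "2019/10/15 00:00:00",
        "2019/10/15 00:00:00",
        "2019/09/06 00:00:00",
        "2019/09/10 00:00:00",
    ]),
})

s = (df.drop_duplicates(["Rank", "Name", "ID"])      # one row per group and rank
       .set_index(["Name", "ID", "Rank"])["DateTime"]
       .unstack()                                    # ranks become columns
       .shift(axis=1)                                # move each value one rank to the right
       .stack()                                      # back to a (Name, ID, Rank) index
       .rename("Elapsed Time Previous"))

# Attach the lookup to every original row, including the tied ones.
result = df.join(s, on=["Name", "ID", "Rank"])
prev = result["Elapsed Time Previous"]
print(result)
```

Shifting along the columns after `unstack` keeps each (Name, ID) group on its own row, so a value never leaks from one group into the next.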

