How to use shift in pandas based on ranking column
My data looks like this.
I want to get the previous DateTime based on Rank. When I use the pandas shift(1) function, I get a previous DateTime of '2019/10/15 00:00:00' instead of '2019/10/11 08:31:00' on the 9th row, and the same thing happens in the other rank groups. When the rank is the same, I want the same previous time. The required results are below.
+------+---------------------+-----------------------+------+
| Rank | DateTime | Elapsed Time Previous | Name |
+------+---------------------+-----------------------+------+
| 1 | 2019/09/23 08:45:00 | | |
| 2 | 2019/09/27 10:14:00 | 2019/09/23 08:45:00 | |
| 3 | 2019/10/01 09:28:00 | 2019/09/27 10:14:00 | |
| 4 | 2019/10/04 14:25:00 | 2019/10/01 09:28:00 | |
| 5 | 2019/10/08 10:46:00 | 2019/10/04 14:25:00 | |
| 6 | 2019/10/11 08:25:00 | 2019/10/08 10:46:00 | |
| 7 | 2019/10/11 08:31:00 | 2019/10/11 08:25:00 | |
| 8 | 2019/10/15 00:00:00 | 2019/10/11 08:31:00 | |
| 8 | 2019/10/15 00:00:00 | 2019/10/11 08:31:00 | |
| 1 | 2019/09/06 00:00:00 | | |
| 2 | 2019/09/10 00:00:00 | 2019/09/06 00:00:00 | |
| 3 | 2019/09/13 00:00:00 | 2019/09/10 00:00:00 | |
| 4 | 2019/09/17 00:00:00 | 2019/09/13 00:00:00 | |
| 5 | 2019/09/20 10:00:00 | 2019/09/17 00:00:00 | |
| 6 | 2019/09/24 00:00:00 | 2019/09/20 10:00:00 | |
| 7 | 2019/09/27 10:53:00 | 2019/09/24 00:00:00 | |
| 8 | 2019/10/01 10:21:00 | 2019/09/27 10:53:00 | |
| 9 | 2019/10/04 09:59:00 | 2019/10/01 10:21:00 | |
| 10 | 2019/10/08 09:58:00 | 2019/10/04 09:59:00 | |
| 11 | 2019/10/11 10:41:00 | 2019/10/08 09:58:00 | |
| 1 | 2019/09/23 09:00:00 | | |
| 2 | 2019/09/27 11:03:00 | 2019/09/23 09:00:00 | |
| 3 | 2019/10/01 10:14:00 | 2019/09/27 11:03:00 | |
| 4 | 2019/10/04 09:46:00 | 2019/10/01 10:14:00 | |
| 5 | 2019/10/08 10:04:00 | 2019/10/04 09:46:00 | |
| 6 | 2019/10/11 10:33:00 | 2019/10/08 10:04:00 | |
| 7 | 2019/10/15 00:00:00 | 2019/10/11 10:33:00 | |
| 7 | 2019/10/15 00:00:00 | 2019/10/11 10:33:00 | |
+------+---------------------+-----------------------+------+
Use DataFrame.drop_duplicates together with Series.shift on the Series you get after converting Rank to the index; Series.map can then be used as the last step to assign the shifted values back to every row:
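import pandas as pd

df['DateTime'] = pd.to_datetime(df['DateTime'])
# Keep one row per Rank, index by Rank, then shift so each Rank holds the previous Rank's DateTime
s = df.drop_duplicates('Rank').set_index('Rank')['DateTime'].shift()
# Map the shifted values back so rows with the same Rank get the same previous DateTime
df['Previous Datetime'] = df['Rank'].map(s)
print(df)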
DateTime Previous Datetime Rank
0 2019-09-06 00:00:00 NaT 1
1 2019-09-10 00:00:00 2019-09-06 00:00:00 2
2 2019-09-13 00:00:00 2019-09-10 00:00:00 3
3 2019-09-17 00:00:00 2019-09-13 00:00:00 4
4 2019-09-20 10:00:00 2019-09-17 00:00:00 5
5 2019-09-24 00:00:00 2019-09-20 10:00:00 6
6 2019-09-27 10:21:00 2019-09-24 00:00:00 7
7 2019-10-01 00:00:00 2019-09-27 10:21:00 8
8 2019-10-01 00:00:00 2019-09-27 10:21:00 8
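To make the difference from a plain shift(1) concrete, here is a minimal self-contained sketch. The four-row frame is a hypothetical sample built from ranks 6 to 8 of the first group in the question, and the variable name `naive` is only for illustration; it assumes the rows are already ordered by Rank.

```python
import pandas as pd

# Hypothetical sample: ranks 6-8 of the first group, with rank 8 duplicated
df = pd.DataFrame({
    "Rank": [6, 7, 8, 8],
    "DateTime": pd.to_datetime([
        "2019/10/11 08:25:00",
        "2019/10/11 08:31:00",
        "2019/10/15 00:00:00",
        "2019/10/15 00:00:00",
    ]),
})

# Plain shift works row by row, so the second rank-8 row
# receives the first rank-8 row's own DateTime
naive = df["DateTime"].shift(1)

# One row per Rank, shifted, then mapped back:
# tied ranks now share the DateTime of the previous distinct Rank
s = df.drop_duplicates("Rank").set_index("Rank")["DateTime"].shift()
df["Previous Datetime"] = df["Rank"].map(s)

print(pd.concat([df, naive.rename("naive shift(1)")], axis=1))
```

In this sample the last row of `naive` is 2019-10-15 00:00:00, while both rank-8 rows of `Previous Datetime` are 2019-10-11 08:31:00, which is the behaviour asked for in the question.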
EDIT:
df = df.drop('Elapsed Time Previous', axis=1)
df['DateTime'] = pd.to_datetime(df['DateTime'])
# df['Elapsed Time Previous'] =
s = (df.drop_duplicates(['Rank','Name', 'ID'])
       .set_index(['Name', 'ID', 'Rank'])['DateTime']
       .unstack()
       .shift(axis=1)
       .stack()
       .rename('Elapsed Time Previous'))
df = df.join(s, on=['Name','ID','Rank'])
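For comparison, the same per-group result can probably also be reached without unstack/stack. The sketch below is an alternative, not the approach shown above: it swaps in GroupBy.shift followed by a left merge. The Name and ID values are made up (the question's table leaves Name empty), and it assumes the rows are already sorted by Rank within each (Name, ID) group.

```python
import pandas as pd

# Hypothetical sample: two (Name, ID) groups, each ending in a tied rank
df = pd.DataFrame({
    "Name": ["A", "A", "A", "B", "B", "B"],
    "ID": [1, 1, 1, 2, 2, 2],
    "Rank": [7, 8, 8, 6, 7, 7],
    "DateTime": pd.to_datetime([
        "2019/10/11 08:31:00",
        "2019/10/15 00:00:00",
        "2019/10/15 00:00:00",
        "2019/10/11 10:33:00",
        "2019/10/15 00:00:00",
        "2019/10/15 00:00:00",
    ]),
})

keys = ["Name", "ID"]

# One row per (Name, ID, Rank), then shift DateTime within each (Name, ID) group
uniq = df.drop_duplicates(keys + ["Rank"]).copy()
uniq["Elapsed Time Previous"] = uniq.groupby(keys)["DateTime"].shift()

# Left merge back so tied ranks share the same previous DateTime
out = df.merge(
    uniq[keys + ["Rank", "Elapsed Time Previous"]],
    on=keys + ["Rank"],
    how="left",
)
print(out)
```

The first row of each group has no previous rank, so its Elapsed Time Previous stays NaT and nothing leaks across groups.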