Creating a new column by shifting an existing column down one row using pandas
I am working with sports data. The purpose is to record the current eventdatetime and a PreviousEventTime for each event in a game. I have a sample dataset at the link below.
https://drive.google.com/open?id=1DUNrWPFwrkZHpq_KeA4rZCJ94sbpUEDI
In this file, there are 11 columns. The events are recorded in time order. For this rearrangement, I will be using the gsm_ID and eventdatetime columns.
I want to create a new column, PreviousEventTime, that takes the value from row n-1 of the eventdatetime column. That means for every gsm_ID there will be a first eventdatetime. The new column should hold, for each row, the time of the preceding event in the same game.
gsm_ID     eventdatetime       PreviousEventTime
2462794    08/11/2017 18:46    08/11/2017 18:45
2462794    08/11/2017 18:49    08/11/2017 18:46
2462794    08/11/2017 19:13    08/11/2017 18:49
2462794    08/11/2017 19:31    08/11/2017 19:13
2462794    08/11/2017 20:09    08/11/2017 19:31
2462795    08/12/2017 17:39    08/12/2017 16:30
2462795    08/12/2017 17:44    08/12/2017 17:39
The example above covers just two games, which you can tell apart by gsm_id. The first row of PreviousEventTime for each game will always be matchdatetime. I will have over 100 games, but the process repeats as in the example above. I tried the following:
eventdata['PreviousEventTime-1'] = eventdata.groupby(['gsm_id'])['eventdatetime'].shift(-1)
But it only works for the first gsm_ID; it did not work for the other gsm_ID values. The output from the script above is below:
Your advice would be much appreciated.
Regards, zephyr
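For reference, here is a minimal sketch of how the shift direction behaves within groups. The small frame is made up from a few of the example rows above, not read from the linked file, and the dates are assumed to be day-first. The column names `prev_event` and `next_event` are illustrative only. Within each gsm_id, `shift(1)` moves values down so each row sees the previous row's time, while `shift(-1)` moves values up so each row sees the next row's time.

```python
import pandas as pd

fmt = "%d/%m/%Y %H:%M"  # assumption: dates in the example are day-first

# Hypothetical sample built from the example rows in the question
eventdata = pd.DataFrame({
    "gsm_id": [2462794, 2462794, 2462794, 2462795, 2462795],
    "eventdatetime": pd.to_datetime(
        ["08/11/2017 18:46", "08/11/2017 18:49", "08/11/2017 19:13",
         "08/12/2017 17:39", "08/12/2017 17:44"],
        format=fmt,
    ),
})

grouped = eventdata.groupby("gsm_id")["eventdatetime"]

# shift(1): each row gets the previous row's value within its game
eventdata["prev_event"] = grouped.shift(1)
# shift(-1): each row gets the next row's value within its game
eventdata["next_event"] = grouped.shift(-1)

print(eventdata)
```

The first row of each game has no previous event, so `prev_event` is NaT there; likewise the last row of each game has NaT in `next_event`. Both results depend on the rows already being in chronological order within each game.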
Sorting properly solved the problem. I added the following sorting and indexing:
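# Index by game id, then order the events chronologically within each match
eventdata = eventdata.set_index(['gsm_id']).sort_index(ascending=True)
eventdata = eventdata.sort_values(['matchdatetime', 'time'], ascending=[True, True])
# Shift eventdatetime down one row within each game to get the previous event time
eventdata['PreviousEventTime-1'] = eventdata.groupby(['gsm_id', 'matchdatetime'])['eventdatetime'].shift(1)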
The remaining part is to fill the NaT values with matchdatetime. Thanks, everyone, for advising me.
Regards, zephyr
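One possible way to handle the remaining NaT values, sketched under assumptions: the frame below is invented to mirror the example table, with matchdatetime repeated on every row of a game and dates read as day-first. It is not the linked dataset. `Series.fillna` accepts another Series and aligns it on the index, so the first row of each game takes that game's matchdatetime.

```python
import pandas as pd

fmt = "%d/%m/%Y %H:%M"  # assumption: dates in the example are day-first

# Hypothetical sample mirroring the example table, deliberately out of order
eventdata = pd.DataFrame({
    "gsm_id": [2462794, 2462794, 2462794, 2462795, 2462795],
    "matchdatetime": pd.to_datetime(
        ["08/11/2017 18:45"] * 3 + ["08/12/2017 16:30"] * 2, format=fmt),
    "eventdatetime": pd.to_datetime(
        ["08/11/2017 18:49", "08/11/2017 18:46", "08/11/2017 19:13",
         "08/12/2017 17:44", "08/12/2017 17:39"], format=fmt),
})

# Sort within each game so "previous" means chronologically previous,
# and keep a unique row index so the fill aligns row by row
eventdata = eventdata.sort_values(["gsm_id", "eventdatetime"]).reset_index(drop=True)

# Previous event time per game; the first row of each game is NaT
prev = eventdata.groupby("gsm_id")["eventdatetime"].shift(1)

# Replace the NaT rows with the match start time from the same row
eventdata["PreviousEventTime"] = prev.fillna(eventdata["matchdatetime"])

print(eventdata)
```

The sketch resets to a unique row index before filling so that the alignment between the shifted values and matchdatetime is unambiguous.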