
Creating a new column by shifting an existing column down by one row using pandas

I am working on sports data. The purpose is to record the current eventdatetime and the PreviousEventTime for each event in a game. A sample dataset is available at the link below.

https://drive.google.com/open?id=1DUNrWPFwrkZHpq_KeA4rZCJ94sbpUEDI

The file has 11 columns, and the events are recorded in time order. For this rearrangement, I will use the gsm_ID and eventdatetime columns.

I want to create a new column, PreviousEventTime, that takes the value from row n-1 of the eventdatetime column. This is done separately for every gsm_ID, so each gsm_ID has its own first eventdatetime. For each row, the new column holds the time of the event that came just before it.

gsm_ID   eventdatetime     PreviousEventTime
2462794  08/11/2017 18:46  08/11/2017 18:45
2462794  08/11/2017 18:49  08/11/2017 18:46
2462794  08/11/2017 19:13  08/11/2017 18:49
2462794  08/11/2017 19:31  08/11/2017 19:13
2462794  08/11/2017 20:09  08/11/2017 19:31
2462795  08/12/2017 17:39  08/12/2017 16:30
2462795  08/12/2017 17:44  08/12/2017 17:39

The above example covers just two games, which you can tell apart by gsm_id. The first row of PreviousEventTime for each game will always be matchdatetime. I will have over 100 games, but the process repeats as in the example above. This is what I tried:

eventdata['PreviousEventTime-1'] = eventdata.groupby(['gsm_id'])['eventdatetime'].shift(-1)

But it only works for the first gsm_ID; it did not work for the other gsm_ID values. The output from the above script is below:

[image: script output]
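For reference, here is a minimal sketch of how the sign passed to shift changes the result inside a groupby. It uses a few rows from the sample table above. The date format string is an assumption (the sample does not say whether the dates are month-first or day-first), and the column names previous and next are hypothetical names used only for illustration.

```python
import pandas as pd

# Assumption: dates are month/day/year; the source does not say which.
fmt = "%m/%d/%Y %H:%M"

# A few rows taken from the sample table, already in time order.
eventdata = pd.DataFrame({
    "gsm_id": [2462794, 2462794, 2462794, 2462795, 2462795],
    "eventdatetime": pd.to_datetime(
        ["08/11/2017 18:46", "08/11/2017 18:49", "08/11/2017 19:13",
         "08/12/2017 17:39", "08/12/2017 17:44"],
        format=fmt,
    ),
})

grouped = eventdata.groupby("gsm_id")["eventdatetime"]
previous = grouped.shift(1)   # value from row n-1 within the same gsm_id
following = grouped.shift(-1)  # value from row n+1 within the same gsm_id

eventdata["previous"] = previous  # hypothetical column name
eventdata["next"] = following     # hypothetical column name
print(eventdata)
```

With shift(1) the first row of each game is NaT and every other row holds the earlier event time. With shift(-1) the last row of each game is NaT and every other row holds the later event time, which is the opposite of a "previous" column.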

Your advice would be much appreciated. Regards, zephyr

Sorting properly solved the problem. I added the following sorting and indexing:

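# Index the rows by gsm_id and sort by that index
eventdata = eventdata.set_index(['gsm_id']).sort_index(ascending=True)
# Order the rows by match start time, then by event time
eventdata = eventdata.sort_values(['matchdatetime', 'time'], ascending=[True, True])
# Previous event time within each game; the first row of each game is NaT
eventdata['PreviousEventTime-1'] = eventdata.groupby(['gsm_id', 'matchdatetime'])['eventdatetime'].shift(1)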
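The effect of sorting before shifting can be sketched on a small frame whose rows are deliberately out of order. This is an illustration under assumptions, not the exact code above: it sorts on gsm_id and eventdatetime because the matchdatetime and time columns are not shown in the sample, and the month-first date format is a guess.

```python
import pandas as pd

# Assumption: dates are month/day/year; the source does not say which.
fmt = "%m/%d/%Y %H:%M"

# Rows from the sample table, shuffled so they are not in time order.
eventdata = pd.DataFrame({
    "gsm_id": [2462795, 2462794, 2462794, 2462795, 2462794],
    "eventdatetime": pd.to_datetime(
        ["08/12/2017 17:44", "08/11/2017 18:49", "08/11/2017 18:46",
         "08/12/2017 17:39", "08/11/2017 19:13"],
        format=fmt,
    ),
})

# Sort by game and then by event time so that "row n-1" really is the
# previous event of the same game.
eventdata = (
    eventdata.sort_values(["gsm_id", "eventdatetime"])
    .reset_index(drop=True)
)

# Shift within each game; the first event of each game becomes NaT.
eventdata["PreviousEventTime"] = (
    eventdata.groupby("gsm_id")["eventdatetime"].shift(1)
)
print(eventdata)
```

shift works on row position within each group, not on the timestamps themselves, so the order of the rows at the moment of the shift decides what counts as the previous event.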

What remains is to fill the NaT values with matchdatetime. Thanks, everyone, for your advice. Regards, zephyr
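One possible way to do that remaining step is to pass the matchdatetime column to fillna, which fills each NaT with the value from the same row. The sketch below is only an illustration. It takes the matchdatetime values from the first PreviousEventTime of each game in the sample table, assumes a month-first date format, and uses the column name PreviousEventTime rather than the 'PreviousEventTime-1' name in the code above.

```python
import pandas as pd

# Assumption: dates are month/day/year; the source does not say which.
fmt = "%m/%d/%Y %H:%M"

eventdata = pd.DataFrame({
    "gsm_id": [2462794, 2462794, 2462794, 2462795, 2462795],
    # Match start times, read off the first rows of the sample table.
    "matchdatetime": pd.to_datetime(
        ["08/11/2017 18:45"] * 3 + ["08/12/2017 16:30"] * 2, format=fmt
    ),
    "eventdatetime": pd.to_datetime(
        ["08/11/2017 18:46", "08/11/2017 18:49", "08/11/2017 19:13",
         "08/12/2017 17:39", "08/12/2017 17:44"],
        format=fmt,
    ),
})

# Previous event time within each game; first row per game is NaT.
previous = eventdata.groupby("gsm_id")["eventdatetime"].shift(1)

# Fill each NaT with the matchdatetime from the same row (aligned by index).
eventdata["PreviousEventTime"] = previous.fillna(eventdata["matchdatetime"])
print(eventdata)
```

Because fillna aligns on the index, only the first row of each game takes its matchdatetime; all other rows keep the shifted event time.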
