
How to compare two dates by iterating over a pandas DataFrame and create a new column

I have a pandas DataFrame of customer transactions, as shown below, and I want to create a column named 'Label' that takes one of two values:

  • New Transaction performed before the end date of the previous transaction

  • New Transaction performed after the end date of the previous transaction

Input

Transaction ID    Transaction Start Date    Transaction End Date
      1               23-jun-2014               15-Jul-2014
      2               14-jul-2014               8-Aug-2014
      3               13-Aug-2014               22-Aug-2014
      4               21-Aug-2014               28-Aug-2014
      5               29-Aug-2014               05-Sep-2014
      6               06-Sep-2014               15-Sep-2014

Desired output

Transaction ID    Transaction Start Date    Transaction End Date    Label
      1               23-jun-2014               15-Jul-2014
      2               14-jul-2014               8-Aug-2014          New Transaction performed before end date of previous transaction
      3               13-Aug-2014               22-Aug-2014         New Transaction after the end date of previous transaction.
      4               21-Aug-2014               28-Aug-2014         New Transaction performed before the end date of previous transaction.
      5               29-Aug-2014               05-Sep-2014         New Transaction after the end date of previous transaction.
      6               06-Sep-2014               15-Sep-2014         New Transaction after the end date of previous transaction.

Use numpy.where and Series.shift:

import numpy as np

df['Label'] = np.where(df['Transaction Start Date'].lt(df['Transaction End Date'].shift()), 'New Transaction performed before end date of previous transaction', 'New Transaction after the end date of previous transaction.')
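One caveat with the snippet above: if the two columns still hold strings such as '13-Aug-2014', `lt` compares them character by character rather than chronologically. A minimal sketch of the difference, using two values from the question (the variable names and the explicit `format='%d-%b-%Y'` are assumptions made for illustration):

```python
import pandas as pd

# Start date of transaction 3 and end date of transaction 2, as raw strings
start = pd.Series(['13-Aug-2014'])
prev_end = pd.Series(['8-Aug-2014'])

# Lexical comparison: '1' sorts before '8', so this is True (misleading)
as_strings = bool(start.lt(prev_end).iloc[0])

# Chronological comparison after parsing: 13 Aug is not before 8 Aug
fmt = '%d-%b-%Y'  # assumed format: day, abbreviated month name, year
start_dt = pd.to_datetime(start, format=fmt)
prev_end_dt = pd.to_datetime(prev_end, format=fmt)
as_dates = bool(start_dt.lt(prev_end_dt).iloc[0])

print(as_strings, as_dates)
```

So it is probably safest to parse the columns before comparing them.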

First convert both columns with to_datetime, then use numpy.where with Series.lt to test whether each start date is earlier than the previous row's end date (obtained with Series.shift), and finally set the first row's label to an empty string:

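import numpy as np
import pandas as pd

# Parse both columns as datetimes so the comparison is chronological, not lexical
df['Transaction End Date'] = pd.to_datetime(df['Transaction End Date'])
df['Transaction Start Date'] = pd.to_datetime(df['Transaction Start Date'])

# Compare each start date with the previous row's end date
df['Label'] = np.where(df['Transaction Start Date'].lt(df['Transaction End Date'].shift()),
                       'New Transaction performed before end date of previous transaction',
                       'New Transaction after the end date of previous transaction.')
# The first row has no previous transaction (assumes the default 0-based index)
df.loc[0, 'Label'] = ''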
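To see why the first row needs special handling, it can help to look at what `shift` produces. A small sketch, assuming the question's data with the dates already written in ISO form (the column name `Previous End Date` and the variable `before_prev_end` are introduced here for illustration only):

```python
import pandas as pd

df = pd.DataFrame({
    'Transaction ID': [1, 2, 3, 4, 5, 6],
    'Transaction Start Date': pd.to_datetime(
        ['2014-06-23', '2014-07-14', '2014-08-13',
         '2014-08-21', '2014-08-29', '2014-09-06']),
    'Transaction End Date': pd.to_datetime(
        ['2014-07-15', '2014-08-08', '2014-08-22',
         '2014-08-28', '2014-09-05', '2014-09-15']),
})

# shift() moves every end date down one row; row 0 gets NaT (no previous row)
df['Previous End Date'] = df['Transaction End Date'].shift()

# A less-than comparison against NaT is False, so row 0 would fall into the
# "after" branch of np.where unless it is overwritten afterwards
before_prev_end = df['Transaction Start Date'].lt(df['Previous End Date'])

print(df[['Transaction Start Date', 'Previous End Date']])
print(before_prev_end.tolist())
```

The boolean Series is what `np.where` receives as its condition, which is why the first label is reset by hand.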

Alternative solution, which prepends the empty label for the first row instead of overwriting it afterwards:

m = df['Transaction Start Date'].lt(df['Transaction End Date'].shift())

df['Label'] = [''] + np.where(m, 
              'New Transaction performed before end date of previous transaction', 
              'New Transaction after the end date of previous transaction.')[1:].tolist()

print (df)
   Transaction ID Transaction Start Date Transaction End Date  \
0               1             2014-06-23           2014-07-15   
1               2             2014-07-14           2014-08-08   
2               3             2014-08-13           2014-08-22   
3               4             2014-08-21           2014-08-28   
4               5             2014-08-29           2014-09-05   
5               6             2014-09-06           2014-09-15   

                                               Label  
0                                                     
1  New Transaction performed before end date of p...  
2  New Transaction after the end date of previous...  
3  New Transaction performed before end date of p...  
4  New Transaction after the end date of previous...  
5  New Transaction after the end date of previous...  
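For completeness, here is a self-contained sketch that runs end to end on the question's data. It is a variant rather than either approach above: it swaps in `numpy.select` so that the empty first label is produced in the same step. The helper `label_transactions`, the constants `BEFORE` and `AFTER`, and the `date_format` default are hypothetical names chosen for this example.

```python
import numpy as np
import pandas as pd

BEFORE = 'New Transaction performed before end date of previous transaction'
AFTER = 'New Transaction after the end date of previous transaction.'


def label_transactions(df, date_format='%d-%b-%Y'):
    """Return a copy of df with parsed dates and a 'Label' column (hypothetical helper)."""
    out = df.copy()
    for col in ('Transaction Start Date', 'Transaction End Date'):
        out[col] = pd.to_datetime(out[col], format=date_format)

    # End date of the previous row; NaT for the first row
    prev_end = out['Transaction End Date'].shift()

    # Conditions are checked in order: a missing previous end date -> '',
    # start before previous end -> BEFORE, everything else -> AFTER
    conditions = [
        prev_end.isna().to_numpy(),
        out['Transaction Start Date'].lt(prev_end).to_numpy(),
    ]
    out['Label'] = np.select(conditions, ['', BEFORE], default=AFTER)
    return out


df = pd.DataFrame({
    'Transaction ID': [1, 2, 3, 4, 5, 6],
    'Transaction Start Date': ['23-jun-2014', '14-jul-2014', '13-Aug-2014',
                               '21-Aug-2014', '29-Aug-2014', '06-Sep-2014'],
    'Transaction End Date': ['15-Jul-2014', '8-Aug-2014', '22-Aug-2014',
                             '28-Aug-2014', '05-Sep-2014', '15-Sep-2014'],
})

labelled = label_transactions(df)
print(labelled['Label'].tolist())
```

Unlike `df.loc[0, 'Label'] = ''`, this version does not depend on the index starting at 0, although it would also blank the label of any row whose previous end date is missing.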


