Checking multiple fields in a dataframe (a string field and date fields)

Question

I have a dataframe (df) that looks like:

Id                  Status  Date of entry to current post  Date of entry to current payband
 1  NEW ENTRANT - EXTERNAL                       1/1/2020                         1/1/2019
 2                 CURRENT                       1/1/2020                         1/1/2020

I am trying to write a validation that returns any records where Date of entry to current post is before Date of entry to current payband and the Status field is a new entrant type (there are a few, hence the wildcard).

I have tried the following without success:

df['Date of entry to current post']>df['Date of entry to current payband'] & df['Status'] =='NEW ENTRANT*')

So in this example I would like returned:

Id                  Status  Date of entry to current post  Date of entry to current payband
 1  NEW ENTRANT - EXTERNAL                       1/1/2020                         1/1/2019

How can I tackle this?

Answer 1

If your date columns have a datetime dtype, this should work:

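import numpy as np

# Flag rows where the post date is later than the payband date
# and Status starts with 'NEW ENTRANT' (== would look for a literal '*')
df['Condition'] = np.where((df['Date of entry to current post'] > df['Date of entry to current payband']) & df['Status'].str.startswith('NEW ENTRANT'), 1, 0)
df = df.loc[df['Condition'] == 1]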

Answer 2

You are comparing to the string 'NEW ENTRANT*', which only matches a value that actually contains the * character.

What you want is:

... & df['Status'].str.match('NEW ENTRANT'))
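
To illustrate the difference, here is a minimal sketch on a standalone Series. The 'NEW ENTRANT - INTERNAL' value is a made-up example of another new entrant type and does not come from the question. It compares the equality test with str.match and with str.startswith, which performs the same prefix test without regular expressions:

```python
import pandas as pd

# Hypothetical status values; only the first two appear in the question.
status = pd.Series(["NEW ENTRANT - EXTERNAL", "CURRENT", "NEW ENTRANT - INTERNAL"])

# Equality treats '*' as an ordinary character, so no row matches.
exact = status == "NEW ENTRANT*"

# str.match applies the regular expression at the start of each string.
prefix = status.str.match("NEW ENTRANT")

# str.startswith is a plain prefix test with no regex involved.
starts = status.str.startswith("NEW ENTRANT")

print(exact.tolist(), prefix.tolist(), starts.tolist())
```

Either prefix test can be combined with the date comparison using &, as long as each condition is wrapped in its own parentheses.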

But if the date columns actually contain strings, they will be compared in lexicographic order, which is probably not what you want.
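
As a sketch of how that could be handled, the snippet below rebuilds the sample frame from the question with the dates stored as strings, converts them with pd.to_datetime, and then filters. The day/month/year format is an assumption (the sample values are ambiguous), and the comparison uses > to match the code in the question and its expected output:

```python
import pandas as pd

# Sample data from the question, with dates held as plain strings.
df = pd.DataFrame({
    "Id": [1, 2],
    "Status": ["NEW ENTRANT - EXTERNAL", "CURRENT"],
    "Date of entry to current post": ["1/1/2020", "1/1/2020"],
    "Date of entry to current payband": ["1/1/2019", "1/1/2020"],
})

# Why strings are risky: '2' sorts after '1', so this is True even though
# 2 January is earlier than 10 January (reading the values as day/month/year).
lexicographic_trap = "2/1/2020" > "10/1/2020"

# Assumption: the strings are day/month/year; adjust the format if not.
for col in ["Date of entry to current post", "Date of entry to current payband"]:
    df[col] = pd.to_datetime(df[col], format="%d/%m/%Y")

# Each condition sits in its own parentheses because & binds tighter than >.
mask = (
    (df["Date of entry to current post"] > df["Date of entry to current payband"])
    & df["Status"].str.startswith("NEW ENTRANT")
)
result = df[mask]
print(result)
```

Filtering with the boolean mask directly avoids adding a helper column to the dataframe.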

Answer 1
1 2020-03-04 10:20:30

Answer 2
1 Accepted 2020-03-04 13:50:01