
Filtering Dataframe based on Multiple Date Conditions

I'm working with the following DataFrame:

   id                slotTime    EDD         EDD-10M
0  1000000101068957  2021-05-12  2021-12-26  2021-02-26
1  1000000100849718  2021-03-20  2021-04-05  2020-06-05
2  1000000100849718  2021-03-20  2021-04-05  2020-06-05
3  1000000100849718  2021-03-20  2021-04-05  2020-06-05
4  1000000100849718  2021-03-20  2021-04-05  2020-06-05

I would like to keep only the rows where slotTime is between EDD-10M and EDD, that is, where this condition holds:

df['EDD-10M'] < df['slotTime'] < df['EDD']

I have tried using the following method:

df.loc[df[df['slotTime'] < df['EDD']] & df[df['EDD-10M'] < df['slotTime']]]

However, it yields the following error:

TypeError: ufunc 'bitwise_and' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Please advise.

To replicate the above DataFrame, use the snippet below:

import pandas as pd
from pandas import Timestamp

df = { 
  'id': {0: 1000000101068957,
  1: 1000000100849718,
  2: 1000000100849718,
  3: 1000000100849718,
  4: 1000000100849718,
  5: 1000000100849718,
  6: 1000000100849718,
  7: 1000000100849718,
  8: 1000000100849718,
  9: 1000000100849718},
  'EDD': {0: Timestamp('2021-12-26 00:00:00'),
  1: Timestamp('2021-04-05 00:00:00'),
  2: Timestamp('2021-04-05 00:00:00'),
  3: Timestamp('2021-04-05 00:00:00'),
  4: Timestamp('2021-04-05 00:00:00'),
  5: Timestamp('2021-04-05 00:00:00'),
  6: Timestamp('2021-04-05 00:00:00'),
  7: Timestamp('2021-04-05 00:00:00'),
  8: Timestamp('2021-04-05 00:00:00'),
  9: Timestamp('2021-04-05 00:00:00')},
 'EDD-10M': {0: Timestamp('2021-02-26 00:00:00'),
  1: Timestamp('2020-06-05 00:00:00'),
  2: Timestamp('2020-06-05 00:00:00'),
  3: Timestamp('2020-06-05 00:00:00'),
  4: Timestamp('2020-06-05 00:00:00'),
  5: Timestamp('2020-06-05 00:00:00'),
  6: Timestamp('2020-06-05 00:00:00'),
  7: Timestamp('2020-06-05 00:00:00'),
  8: Timestamp('2020-06-05 00:00:00'),
  9: Timestamp('2020-06-05 00:00:00')},
 'slotTime': {0: Timestamp('2021-05-12 00:00:00'),
  1: Timestamp('2021-03-20 00:00:00'),
  2: Timestamp('2021-03-20 00:00:00'),
  3: Timestamp('2021-03-20 00:00:00'),
  4: Timestamp('2021-03-20 00:00:00'),
  5: Timestamp('2021-03-20 00:00:00'),
  6: Timestamp('2021-03-20 00:00:00'),
  7: Timestamp('2021-03-20 00:00:00'),
  8: Timestamp('2021-03-20 00:00:00'),
  9: Timestamp('2021-03-20 00:00:00')}}

df = pd.DataFrame(df)

You just need to wrap each comparison in parentheses and index the DataFrame with the combined boolean mask:

df[(df['slotTime'] < df['EDD']) & (df['EDD-10M'] < df['slotTime'])]

Otherwise, because & binds more tightly than <, Python tries to evaluate df['EDD'] & df['EDD-10M'] first, and it all falls apart.
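To make the precedence point concrete, here is a minimal sketch. The three-row frame is made up for illustration (it reuses the question's column names but is not the asker's data), and the variable names are my own.

```python
import pandas as pd

# Illustrative data only: same column names as the question, invented rows.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-05-12", "2021-03-20", "2022-01-01"]),
    "EDD": pd.to_datetime(["2021-12-26", "2021-04-05", "2021-04-05"]),
    "EDD-10M": pd.to_datetime(["2021-02-26", "2020-06-05", "2020-06-05"]),
})

# Without parentheses, Python parses
#     a < b & c < d    as    a < (b & c) < d
# so `&` is applied to two datetime columns before any comparison happens.
try:
    df["slotTime"] < df["EDD"] & df["EDD-10M"] < df["slotTime"]
    unparenthesized_failed = False
except (TypeError, ValueError):
    unparenthesized_failed = True

# With parentheses, each comparison produces a boolean Series first,
# and `&` then combines the two masks element-wise.
mask = (df["EDD-10M"] < df["slotTime"]) & (df["slotTime"] < df["EDD"])
filtered = df[mask]
```

Only the third row is dropped here, since its slotTime falls after EDD.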

Alternatively, you may wish to use the .between method (assuming slotTime is a datetime series). Pass the lower bound first; note that both bounds are inclusive by default:

df[df['slotTime'].between(df['EDD-10M'], df['EDD'])]
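One difference from the < version is worth a quick sketch: between includes both endpoints unless told otherwise. The example below uses made-up rows, one of which has slotTime exactly equal to EDD, and assumes pandas 1.3 or newer for the string form of the inclusive argument.

```python
import pandas as pd

# Illustrative data only; the second row sits exactly on the upper bound.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-05-12", "2021-04-05", "2022-01-01"]),
    "EDD": pd.to_datetime(["2021-12-26", "2021-04-05", "2021-04-05"]),
    "EDD-10M": pd.to_datetime(["2021-02-26", "2020-06-05", "2020-06-05"]),
})

# Default behaviour: lower <= value <= upper (both ends included).
inclusive_mask = df["slotTime"].between(df["EDD-10M"], df["EDD"])

# Strict version, matching the `<` comparisons in the question.
# The string value "neither" is assumed to be available (pandas >= 1.3).
strict_mask = df["slotTime"].between(
    df["EDD-10M"], df["EDD"], inclusive="neither"
)

strict_rows = df[strict_mask]
```

If the boundary dates should count as matches, the default is fine; otherwise the strict form is the closer equivalent of the original condition.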

You can use the between() method, as someone has already answered, or try it like this:

df.loc[(df['EDD-10M'] < df['slotTime']) & (df['slotTime'] < df['EDD'])]

You should wrap each condition in ( and ) when combining multiple conditions.

You can do this by using query (the backticks are needed because EDD-10M is not a valid Python identifier):

df.query("(slotTime < EDD) & (`EDD-10M` < slotTime)")
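As a rough check that query gives the same rows as the boolean-mask approach, here is a small sketch on invented data (same column names as the question, not the asker's frame). Backtick quoting of column names is assumed to be supported by the installed pandas version.

```python
import pandas as pd

# Illustrative data only.
df = pd.DataFrame({
    "slotTime": pd.to_datetime(["2021-05-12", "2021-03-20", "2022-01-01"]),
    "EDD": pd.to_datetime(["2021-12-26", "2021-04-05", "2021-04-05"]),
    "EDD-10M": pd.to_datetime(["2021-02-26", "2020-06-05", "2020-06-05"]),
})

# Column names that are not valid identifiers are wrapped in backticks.
via_query = df.query("(slotTime < EDD) & (`EDD-10M` < slotTime)")

# The same filter written as an explicit boolean mask, for comparison.
via_mask = df[(df["slotTime"] < df["EDD"]) & (df["EDD-10M"] < df["slotTime"])]

same_rows = via_query.equals(via_mask)
```

Which form to use is mostly a matter of taste: query reads closer to the condition as stated, while the mask form avoids string expressions.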
