How to do conditional calculations in Python
I have sample data that looks like this.
Column DateDuration was calculated in Excel, following the logic below:
I would like to calculate DateDuration in Python but do not know how to go about this.
Types of data in Python:
I am new to Python. Any help would be greatly appreciated!
import pandas as pd
import numpy as np
df['FirstDate'] = pd.to_datetime(df['FirstDate'])
df['SecondDate'] = pd.to_datetime(df['SecondDate'])
df['DayDifference2'] = (df['SecondDate']) -(df['FirstDate'])
df['DayDifference3'] = (df['ThirdDate']) -(df['FirstDate'])
df['DayDifference4'] = (df['FourthDate']) -(df['FirstDate'])
x = df['DayDifference2'].dt.days
y = df['DayDifference3'].dt.days
z = df['DayDifference4'].dt.days
condlist = [x<28, x>=28]
choicelist = [(df['ThirdDate']) -(df['FirstDate']), (df['SecondDate']) -(df['FirstDate'])]
np.select(condlist, choicelist)
My data:
ID | FirstDate | SecondDate | ThirdDate | FourthDate | DateDuration
---|---|---|---|---|---
2914300 | 2021-09-23 | 2021-10-07 | 2021-11-29 | 2021-12-20 | 67
3893461 | 2021-09-08 | 2021-10-06 | 2022-04-07 | | 211
4343075 | 2021-06-23 | 2021-09-27 | | | 96
4347772 | 2021-06-23 | 2021-09-27 | | | 96
4551963 | 2021-08-02 | 2021-10-14 | 2022-03-11 | | 73
4893324 | 2021-09-30 | 2021-10-01 | 2022-03-03 | 2022-03-10 | 154
5239991 | 2021-06-24 | 2021-08-26 | 2021-09-25 | 2022-02-03 | 63
8454947 | 2021-09-28 | 2021-10-05 | | | 7
8581390 | 2021-09-27 | 2022-03-21 | 2022-03-25 | | 175
8763766 | 2021-09-20 | 2021-10-04 | 2021-12-09 | | 80
9144185 | 2021-06-18 | 2021-06-23 | | | 5
9967685 | 2021-09-13 | 2021-10-29 | 2022-02-07 | 2022-03-23 | 46
11367560 | 2021-08-31 | 2021-09-28 | 2021-10-21 | 2022-02-11 | 51
Refer to the built-in datetime module. It provides several date and time classes, which I used for my own workout routine maker.
import datetime as dt
# particularly the types:
dt.timedelta(1)
# and
dt.time(minute=0, second=0)
# there are also date classes you can use.
Documentation: https://docs.python.org/3/library/datetime.html
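As a small illustration of the classes mentioned above, here is a minimal sketch that subtracts two `datetime.date` objects to get a `timedelta`. The two dates are taken from the first row of the question's data; the variable names are only illustrative.

```python
import datetime as dt

# FirstDate and SecondDate of the first sample row (ID 2914300)
first = dt.date(2021, 9, 23)
second = dt.date(2021, 10, 7)

# Subtracting two dates yields a timedelta
delta = second - first
print(delta.days)  # 14
```

This works for single values; for whole DataFrame columns, the pandas equivalents (`pd.to_datetime` and the `.dt.days` accessor) are usually more convenient.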
import pandas as pd
import numpy as np
df = pd.read_csv('date_example.csv')
# Convert the four date columns to datetime64[ns]; empty cells become NaT
date_cols = ['FirstDate', 'SecondDate', 'ThirdDate', 'FourthDate']
df[date_cols] = df[date_cols].apply(pd.to_datetime)
df
NaT is the missing value for the datetime64[ns] type.
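To see how NaT behaves in the calculations that follow, here is a minimal sketch using two rows from the question's data (ID 8454947, which has no ThirdDate, and ID 2914300). The frame name `dates` is only illustrative.

```python
import pandas as pd

dates = pd.DataFrame({
    "FirstDate": pd.to_datetime(["2021-09-28", "2021-09-23"]),
    "ThirdDate": pd.to_datetime([None, "2021-11-29"]),  # None becomes NaT
})

# A difference involving NaT is itself NaT
diff = dates["ThirdDate"] - dates["FirstDate"]
print(diff.isna().tolist())  # [True, False]

# .dt.days turns NaT into NaN, and comparisons with NaN are False
days = diff.dt.days
print((days >= 28).tolist())  # [False, True]
```

Because a comparison against a missing value is always False, the conditions need explicit `isna()` / `notna()` checks to tell "less than 28 days" apart from "no date at all".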
Conditions & Choices
conditions = [
(df['SecondDate'] - df['FirstDate']).dt.days >= 28,
((df['SecondDate'] - df['FirstDate']).dt.days < 28) & df['ThirdDate'].isna(),
((df['SecondDate'] - df['FirstDate']).dt.days < 28) & df['ThirdDate'].notna() & ((df['ThirdDate'] - df['FirstDate']).dt.days >= 28),
((df['SecondDate'] - df['FirstDate']).dt.days < 28) & df['ThirdDate'].notna() & ((df['ThirdDate'] - df['FirstDate']).dt.days < 28) & df['FourthDate'].isna(),
((df['SecondDate'] - df['FirstDate']).dt.days < 28) & df['ThirdDate'].notna() & ((df['ThirdDate'] - df['FirstDate']).dt.days < 28) & df['FourthDate'].notna()
]
choices = [
(df['SecondDate'] - df['FirstDate']).dt.days,
(df['SecondDate'] - df['FirstDate']).dt.days,
(df['ThirdDate'] - df['FirstDate']).dt.days,
(df['ThirdDate'] - df['FirstDate']).dt.days,
(df['FourthDate'] - df['FirstDate']).dt.days
]
df['Duration'] = np.select(conditions, choices)
df
Discussion: There are some differences from your DateDuration column. For example, in the second row (ID = 3893461), SecondDate - FirstDate is exactly 28, so under your condition (if SecondDate - FirstDate >= 28, then DateDuration = SecondDate - FirstDate) the result is 28, whereas your column shows 211. The same thing happens in the last row (ID = 11367560).
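One possible explanation, and this is only an assumption since the Excel formula itself is not shown, is that the spreadsheet used a strict comparison (> 28) rather than >= 28. The sketch below condenses the conditions into a hypothetical helper, `duration`, and runs it on five rows copied from the question's data with both comparisons.

```python
import numpy as np
import pandas as pd

# Five rows copied from the question's sample data (None becomes NaT)
df = pd.DataFrame({
    "ID": [2914300, 3893461, 4343075, 8454947, 11367560],
    "FirstDate": ["2021-09-23", "2021-09-08", "2021-06-23", "2021-09-28", "2021-08-31"],
    "SecondDate": ["2021-10-07", "2021-10-06", "2021-09-27", "2021-10-05", "2021-09-28"],
    "ThirdDate": ["2021-11-29", "2022-04-07", None, None, "2021-10-21"],
    "FourthDate": ["2021-12-20", None, None, None, "2022-02-11"],
    "DateDuration": [67, 211, 96, 7, 51],
})
date_cols = ["FirstDate", "SecondDate", "ThirdDate", "FourthDate"]
df[date_cols] = df[date_cols].apply(pd.to_datetime)


def duration(frame, strict):
    """Hypothetical helper: days from FirstDate to the first later date
    that passes the 28-day test, else to the last available date."""
    d2 = (frame["SecondDate"] - frame["FirstDate"]).dt.days
    d3 = (frame["ThirdDate"] - frame["FirstDate"]).dt.days
    d4 = (frame["FourthDate"] - frame["FirstDate"]).dt.days
    long2 = d2 > 28 if strict else d2 >= 28
    long3 = d3 > 28 if strict else d3 >= 28
    conditions = [
        long2 | d3.isna(),                  # use SecondDate
        long3 | d4.isna(),                  # otherwise use ThirdDate
        pd.Series(True, index=frame.index), # otherwise use FourthDate
    ]
    result = np.select(conditions, [d2, d3, d4])
    return pd.Series(result, index=frame.index).astype("Int64")


print(duration(df, strict=False).tolist())  # [67, 28, 96, 7, 28]
print(duration(df, strict=True).tolist())   # [67, 211, 96, 7, 51]
```

On these rows, the strict comparison reproduces the DateDuration column, while >= 28 differs only for the two rows where SecondDate - FirstDate is exactly 28.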