Counting values in pandas dataframe of non-nan values with date comparison condition
I have the following dataframe:
Date_1 Date_2 Date_3
2019-12-18 13:43:47 2019-12-18 13:43:47
2019-12-18 13:43:48 2019-12-18 13:43:47
2020-12-18 17:51:17
2020-12-18 17:51:17 2020-12-18 17:51:17 2020-12-18 17:51:17
I am trying to count the number of values present in each column that satisfy the condition that the date is later than today.
My code:
today=pd.Timestamp.today() - pd.Timedelta(days=1)
total_date_1_events = len([df['Date_1']>today])+1
total_date_2_events = len([df['Date_2']>today])+1
total_date_3_events = len([df['Date_3']>today])+1
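If I print each of the 3 variables, they all output the same result, 4, which I understand is because the empty rows are also being counted.
I would like to get the following result: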
total_date_1_events = 2 # because there are only 2 dates that are bigger than today
total_date_2_events = 1 # because there is only 1 date that is bigger than today
total_date_3_events = 1 # because there is only 1 date that is bigger than today
Thanks for any suggestions.
Simply do:
sum(df.Date_1>pd.Timestamp.today())
sum(df.Date_2>pd.Timestamp.today())
sum(df.Date_3>pd.Timestamp.today())
The pandas way, with Series.gt and Series.sum:
df['Date_1'].gt(today).sum()
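Why this skips the empty cells: a comparison against a missing date (NaT) evaluates to False, and summing a boolean Series counts only the True entries. Here is a minimal sketch of that behaviour. The sample values and the fixed `cutoff` are assumptions made for reproducibility; the question itself compares against pd.Timestamp.today().

```python
import pandas as pd

# Hypothetical sample column; None becomes NaT and stands in for an empty cell.
dates = pd.Series(pd.to_datetime([
    "2019-12-18 13:43:47",
    None,
    "2020-12-18 17:51:17",
]))

# Fixed cutoff (assumption) so the result does not depend on the current date.
cutoff = pd.Timestamp("2020-01-01")

mask = dates.gt(cutoff)   # NaT compares as False, so empty cells never match
count = int(mask.sum())   # each True counts as 1

print(mask.tolist())
print(count)
```

Note that len(mask) would return the number of rows regardless of how many entries are True, which is why counting with len does not filter anything.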
If you need more columns, you can do:
s = df[['Date_1','Date_2','Date_3']].gt(today).sum()
This creates a Series. You can access the values with:
s['Date_1']
s['Date_2']
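Putting it together, here is a self-contained sketch of the multi-column version. The placement of the empty cells is an assumption (the table in the question does not show which columns are blank), chosen so that it matches the expected counts of 2, 1 and 1. The fixed `cutoff` is also an assumption, used instead of pd.Timestamp.today() so the output stays the same whenever it is run. The conversion step is only needed if the columns are stored as strings.

```python
import pandas as pd

# Hypothetical reconstruction of the question's data; None marks an empty cell.
df = pd.DataFrame({
    "Date_1": ["2019-12-18 13:43:47", "2019-12-18 13:43:48",
               "2020-12-18 17:51:17", "2020-12-18 17:51:17"],
    "Date_2": ["2019-12-18 13:43:47", "2019-12-18 13:43:47",
               None, "2020-12-18 17:51:17"],
    "Date_3": [None, None, None, "2020-12-18 17:51:17"],
})

# Convert every column to datetime64; empty cells become NaT.
df = df.apply(pd.to_datetime)

# Fixed cutoff (assumption) in place of pd.Timestamp.today().
cutoff = pd.Timestamp("2020-06-01")

# Element-wise comparison, then a column-wise sum of the True values.
s = df[["Date_1", "Date_2", "Date_3"]].gt(cutoff).sum()

print(s["Date_1"], s["Date_2"], s["Date_3"])
```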