Counting non-NaN values in a pandas DataFrame that meet a date comparison condition

I have the following dataframe:

Date_1                  Date_2                  Date_3
2019-12-18 13:43:47                             2019-12-18 13:43:47
2019-12-18 13:43:48     2019-12-18 13:43:47     
2020-12-18 17:51:17
2020-12-18 17:51:17     2020-12-18 17:51:17     2020-12-18 17:51:17

I am trying to count the number of values in each column that meet the condition that the date is later than today.

My code:

today=pd.Timestamp.today() - pd.Timedelta(days=1)

total_date_1_events = len([df['Date_1']>today])+1
total_date_2_events = len([df['Date_2']>today])+1
total_date_3_events = len([df['Date_3']>today])+1

If I print my three variables, they all output the same result, 4. I understand that this is because empty rows are being counted as well.

I would like to get the following results:

total_date_1_events = 2 # because only 2 dates are later than today
total_date_2_events = 1 # because only 1 date is later than today
total_date_3_events = 1 # because only 1 date is later than today

Thank you for your suggestions.

Simply do:

sum(df.Date_1>pd.Timestamp.today())
sum(df.Date_2>pd.Timestamp.today())
sum(df.Date_3>pd.Timestamp.today())
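
No extra filtering of empty cells is needed here, because a comparison against a missing datetime (NaT) evaluates to False, so those cells never add to the sum. Here is a minimal sketch of that behaviour, assuming the column has been parsed with pd.to_datetime. The series and the fixed cutoff are hypothetical stand-ins chosen so the result does not depend on the day the code is run:

```python
import pandas as pd

# Hypothetical column with one missing entry; None is parsed as NaT
dates = pd.to_datetime(pd.Series([
    "2019-12-18 13:43:47",
    None,
    "2020-12-18 17:51:17",
    "2020-12-18 17:51:17",
]))

# Assumption: a fixed cutoff used in place of pd.Timestamp.today()
cutoff = pd.Timestamp("2020-01-01")

mask = dates > cutoff      # NaT > cutoff evaluates to False
count = int(mask.sum())    # each True counts as 1, each False as 0
print(mask.tolist(), count)
```

Only the two 2020 dates pass the comparison; the missing entry and the 2019 date are both False.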

The pandas way, using Series.gt and Series.sum:

df['Date_1'].gt(today).sum()

If you need it for more than one column, you could do:

s = df[['Date_1','Date_2','Date_3']].gt(today).sum()

This creates a Series. You can access its values using:

s['Date_1']
s['Date_2'] 
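
To try the multi-column approach end to end, the sketch below rebuilds the question's sample data and counts the matches per column. It assumes the empty cells are missing values (None here, which becomes NaT after conversion). It also uses a fixed cutoff of 2020-01-01 instead of today, since the real pd.Timestamp.today() would make the counts depend on when the code is run:

```python
import pandas as pd

# Sample data mirroring the question; None marks the empty cells
df = pd.DataFrame({
    "Date_1": ["2019-12-18 13:43:47", "2019-12-18 13:43:48",
               "2020-12-18 17:51:17", "2020-12-18 17:51:17"],
    "Date_2": [None, "2019-12-18 13:43:47", None, "2020-12-18 17:51:17"],
    "Date_3": ["2019-12-18 13:43:47", None, None, "2020-12-18 17:51:17"],
})

# Convert every column to datetime; missing cells become NaT
df = df.apply(pd.to_datetime)

# Assumption: fixed stand-in for "today" so the output is reproducible
cutoff = pd.Timestamp("2020-01-01")

# Element-wise comparison, then a column-wise sum of the True values
s = df[["Date_1", "Date_2", "Date_3"]].gt(cutoff).sum()
counts = s.to_dict()
print(counts)
```

With this cutoff the counts are 2, 1 and 1 for Date_1, Date_2 and Date_3, which matches the results the question asks for.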
