简体   繁体   中英

Calculate the specific time intervals between two given date values in a DataFrame

Given the following DataFrame of pandas in Python:

Displays the switching on and off of 3 bulbs at different times. Using datetime objects.

                        date          code     other              time
1  2022-02-27 15:30:21+00:00             5        ON               NaT
2  2022-02-29 17:05:21+00:00             5       OFF   2 days 01:35:00
3  2022-04-07 17:05:21+00:00             5       OFF               NaT
4  2022-04-06 16:10:21+00:00             4        ON               NaT
5  2022-04-07 15:30:21+00:00             4       OFF   0 days 23:20:00
6  2022-02-03 22:40:21+00:00             3        ON               NaT
7  2022-02-03 23:20:21+00:00             3       OFF   0 days 00:40:00
8  2022-02-04 00:20:21+00:00             3        ON               NaT
9  2022-02-04 14:30:21+00:00             3        ON               NaT
10 2022-01-31 15:30:21+00:00             3        ON               NaT
11 2022-02-04 15:35:21+00:00             3       OFF   4 days 00:05:00
12 2022-02-04 15:40:21+00:00             3       OFF               NaT
13 2022-02-04 19:40:21+00:00             3        ON               NaT
14 2022-02-06 15:35:21+00:00             3       OFF   1 days 19:55:00
15 2022-02-23 21:10:21+00:00             3        ON               NaT
16 2022-02-24 07:10:21+00:00             3       OFF   0 days 10:00:00

I want to add a new column, called nights . This column will include only for rows where the variable time is different from NaT . Information on how many nights the light bulb has been on. The night period is defined as 22:00:00 to 05:00:00.

Example of the resulting DataFrame:

                        date          code     other              time    nights
1  2022-02-27 15:30:21+00:00             5        ON               NaT      0
2  2022-02-29 17:05:21+00:00             5       OFF   2 days 01:35:00      2
3  2022-04-07 17:05:21+00:00             5       OFF               NaT      0
4  2022-04-06 16:10:21+00:00             4        ON               NaT      0
5  2022-04-07 15:30:21+00:00             4       OFF   0 days 23:20:00      1
6  2022-02-03 22:40:21+00:00             3        ON               NaT      0
7  2022-02-03 23:20:21+00:00             3       OFF   0 days 00:40:00      0
8  2022-02-04 00:20:21+00:00             3        ON               NaT      0
9  2022-02-04 14:30:21+00:00             3        ON               NaT      0
10 2022-01-31 15:30:21+00:00             3        ON               NaT      0
11 2022-02-04 15:35:21+00:00             3       OFF   4 days 00:05:00      4
12 2022-02-04 15:40:21+00:00             3       OFF               NaT      0
13 2022-02-04 19:40:21+00:00             3        ON               NaT      0
14 2022-02-06 15:35:21+00:00             3       OFF   1 days 19:55:00      2
15 2022-02-23 21:10:21+00:00             3        ON               NaT      0
16 2022-02-24 07:10:21+00:00             3       OFF   0 days 10:00:00      1

The light bulb ON register is just before the non-NaT value of the time variable.

Information added : In case the bulb is switched off and on in the middle of the night, it will not be taken into account for the variable nights . It has to spend the entire interval switched on.

                        date      code     other              time    nights
1  2022-02-27 21:00:00+00:00         1        ON               NaT      0
2  2022-02-28 01:00:00+00:00         1       OFF   0 days 04:00:00      0
3  2022-02-28 03:15:00+00:00         1        ON               NaT      0
4  2022-02-28 09:30:00+00:00         1       OFF   0 days 06:15:00      0

Not knowing how you want to handle partial nights you could try something like this.... [Updated to handle partial days better] [Update to add imports]

import pandas as pd
import datetime as dt

df['edate'] = df.date + df.time
df['first_night_start'] = pd.to_datetime(df.date.dt.date) + dt.timedelta(hours=22)
df['first_night_end'] = df['first_night_start'] + dt.timedelta(hours=7)
def get_nights(ser):
    if pd.isna(ser.time):
        return
    days = ser.time.days
    the_rest = ser.time - dt.timedelta(days)
    if ((ser.date + the_rest) > ser.first_night_end) and (ser.date < ser.first_night_start):
        return days + 1
    else:
        return days 
df['nights'] = df.apply(get_nights, axis=1)

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM