
How to slice a pandas DataFrame between two dates (day/month) ignoring the year?

I want to filter a pandas DataFrame with a DatetimeIndex spanning multiple years, keeping only the rows between the 15th of April and the 16th of September of each year. Afterwards I want to set a value using that mask.

I was hoping for a function similar to between_time(), but for dates; no such function seems to exist.

My current solution is a loop over the unique years.

Minimal Example

import pandas as pd

df = pd.DataFrame({'target':0}, index=pd.date_range('2020-01-01', '2022-01-01', freq='H'))

start_date = "04-15"
end_date = "09-16"
for year in df.index.year.unique():
    # normal approach
    # df[f'{year}-{start_date}':f'{year}-{end_date}'] = 1

    # similar approach, slightly faster
    # (slice_indexer returns the positional slice for the label range, end day included)
    df.iloc[df.index.slice_indexer(f'{year}-{start_date}', f'{year}-{end_date}')] = 1

Does a solution exist where I can avoid the loop and maybe improve the performance?

If whole months are good enough (April 1st to September 30th), what about using the month?

df.loc[df.index.month.isin(range(4, 10)), 'target'] = 1

If you want to match any date range while ignoring the year, you can replace the year with 2000 (a leap year, so February 29th stays valid) and use:

s = pd.to_datetime(df.index.strftime('2000-%m-%d'))
df.loc[(s >= '2000-04-15') & (s <= '2000-09-16'), 'target'] = 1
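
As a possible alternative that avoids the string round trip, the month and day can be encoded as a single integer (month * 100 + day) and compared directly. This is only a sketch: month_day_mask is a hypothetical helper, not a pandas function, and the example uses a daily index instead of the hourly one from the question to keep it small.

```python
import pandas as pd

def month_day_mask(index, start, end):
    """Boolean mask for timestamps whose (month, day) lies in [start, end], for any year.

    start and end are (month, day) tuples. Hypothetical helper, not part of pandas.
    """
    key = index.month * 100 + index.day   # e.g. April 15 -> 415
    lo = start[0] * 100 + start[1]
    hi = end[0] * 100 + end[1]
    if lo <= hi:
        return (key >= lo) & (key <= hi)
    # the window wraps around New Year (e.g. December to February)
    return (key >= lo) | (key <= hi)

# daily index here (assumption for brevity); the same mask works on an hourly index
df = pd.DataFrame({'target': 0}, index=pd.date_range('2020-01-01', '2022-01-01', freq='D'))
mask = month_day_mask(df.index, (4, 15), (9, 16))
df.loc[mask, 'target'] = 1
```

Because the key is built from the date parts only, every timestamp on the end day is included regardless of its time of day, which matches the behaviour of the strftime approach above.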
