I want to filter a pandas DataFrame with DatetimeIndex for multiple years between the 15th of april and the 16th of september. Afterwards I want to set a value the mask.
I was hoping for a function similar to between_time()
, but this doesn't exist.
My actual solution is a loop over the unique years.
import pandas as pd
df = pd.DataFrame({'target':0}, index=pd.date_range('2020-01-01', '2022-01-01', freq='H'))
start_date = "04-15"
end_date = "09-16"
for year in df.index.year.unique():
# normal approche
# df[f'{year}-{start_date}':f'{year}-{end_date}'] = 1
# similar approche slightly faster
df.iloc[df.index.get_loc(f'{year}-{start_date}'):df.index.get_loc(f'{year}-{end_date}')+1]=1
Does a solution exist where I can avoid the loop and maybe improve the performance?
To get the dates between April 1st and October 31st, what about using the month?
df.loc[df.index.month.isin(range(4, 10)), 'target'] == 1
If you want to map any date/time, just ignoring the year, you can replace the year to 2000 (leap year) and use:
s = pd.to_datetime(df.index.strftime('2000-%m-%d'))
df.loc[(s >= '2000-04-15') & (s <= '2020-09-16'), 'target'] = 1
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.