简体   繁体   中英

Remove last n days from dataframe

I have a pandas dataframe with datetime index (30 min frequency). And I want do remove "n" last days from it. My dataframe do not include weekends, so if the last day of it is Monday, I want to remove Monday, Friday and Thursday (from the end). So, I mean observed days, not calendar. What is the most pythonic way to do it?

Thanks.

Pandas knows about Monday to Friday as business days.

So if you want to remove the last n business days from your dataframe, you can just do:

df.drop(df[df.index >= df.index.max().date()-pd.offsets.BDay(n-1)].index, inplace=True)

If you really need to remove observed days in the dataframe, if will be slightly more complex because you will have to count the days. Code could be (using a companion dataframe called df_days ):

# create a dataframe with same index and only one row per day:
df_days = pd.DataFrame(index=df.index).assign(day=df.index.date).drop_duplicates('day')
# now count the observed day in the companion dataframe
df_days['new_day'] = 1
df_days['days'] = df_days['new_day'].cumsum()
# compute first index to remove to remove last observed n days
ix = df_days.loc[df_days['days'] == df_days['days'].max() + 1 - n].index[0]

# ok drop the last observed n days from the initial dataframe and delete the companion one
df.drop(df.loc[df.index > ix].index)
del df_days

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM