Extend a pandas DatetimeIndex by 1 period

Consider the DatetimeIndex dates:

dates = pd.date_range('2016-01-29', periods=4, freq='BM')
dates

DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29'],
              dtype='datetime64[ns]', freq='BM')

I want to extend the index by one period at the frequency attached to the object.


I expect

pd.date_range('2016-01-29', periods=5, freq='BM')

DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29',
               '2016-05-31'],
              dtype='datetime64[ns]', freq='BM')

I've tried

dates.append(dates[[-1]] + pd.offsets.BusinessMonthEnd())

However

  • This is not generalized to use the frequency of dates
  • I get a performance warning:

    PerformanceWarning: Non-vectorized DateOffset being applied to Series or DatetimeIndex

The timestamps in your DatetimeIndex already know that they describe business month ends, so in older pandas versions (where a Timestamp carries its offset, as the reprs here show) you can simply add 1:

import pandas as pd
dates = pd.date_range('2016-01-29', periods=4, freq='BM')

print(repr(dates[-1]))
# => Timestamp('2016-04-29 00:00:00', offset='BM')

print(repr(dates[-1] + 1))
# => Timestamp('2016-05-31 00:00:00', offset='BM')

You can add the latter to your index using .union:

dates = dates.union([dates[-1] + 1])
print(dates)
# => DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29',
#                   '2016-05-31'],
#                  dtype='datetime64[ns]', freq='BM')

Compared to .append, this retains knowledge of the offset.

pandas==1.1.1 Answer for +1

To follow up on this, for pandas==1.1.1, I found this to be the best solution:

dates.union(pd.date_range(dates[-1] + dates.freq, periods=1, freq=dates.freq))

Generalised Answer Using n

n=3
dates.union(pd.date_range(dates[-1] + dates.freq, periods=n, freq=dates.freq))
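The generalised expression above can be wrapped in a small helper. This is only a sketch: extend_index is a hypothetical name, not a pandas function, and it assumes the index has its freq attribute set. The frequency is passed as a pd.offsets.BusinessMonthEnd() object rather than the string 'BM' because, as far as I know, newer pandas releases prefer the alias 'BME' and warn about or reject 'BM'.

```python
import pandas as pd


def extend_index(index: pd.DatetimeIndex, n: int = 1) -> pd.DatetimeIndex:
    """Hypothetical helper: extend `index` by `n` periods at its own freq."""
    if index.freq is None:
        # Without a freq there is no step to extend by.
        raise ValueError("index has no freq set")
    # Build the n new timestamps, starting one period after the last one.
    extra = pd.date_range(index[-1] + index.freq, periods=n, freq=index.freq)
    return index.union(extra)


dates = pd.date_range("2016-01-29", periods=4, freq=pd.offsets.BusinessMonthEnd())
extended = extend_index(dates, n=3)
print(extended)
```

Raising on a missing freq is a design choice for this sketch; an index built by .append or from a plain list may have freq=None and would need its frequency set or inferred first.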

Credits

Taken by combining @alberto-garcia-raboso's answer and @ballpointben's comment.

What Didn't Work

  • The following just produced a plain Index, not a DatetimeIndex: dates.union([dates[-1] + dates.freq])
  • Also, dates[-1] + 1 is deprecated: adding an integer to a Timestamp was removed in pandas 1.0, so use dates[-1] + dates.freq instead.

The best solution is:

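import pandas as pd
dates = pd.date_range('2016-01-29', periods=4, freq='BM')
n = 4  # number of periods to add
extended = dates.union(dates.shift(n)[-n:])

where n is the number of periods you want to add. Note that shift returns an index of the same length as dates, so this only works as intended when n does not exceed len(dates). With n=4, you will get an extended date range like this: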

DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29',
               '2016-05-31', '2016-06-30', '2016-07-29', '2016-08-31'],
              dtype='datetime64[ns]', freq='BM')
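To illustrate the length limit of the shift-based approach, here is a minimal sketch comparing it with rebuilding the whole range from the first timestamp. The rebuild is my own alternative, not part of the answer above, and it assumes dates is a regular range with no gaps. The names via_shift and rebuilt are illustrative, and the frequency is given as an offset object to avoid depending on the 'BM' string alias.

```python
import pandas as pd

dates = pd.date_range("2016-01-29", periods=4, freq=pd.offsets.BusinessMonthEnd())
n = 5  # more periods than the index currently holds

# shift(n) keeps the original length, so [-n:] yields only len(dates) values
# and the first new period (2016-05-31) is skipped.
via_shift = dates.union(dates.shift(n)[-n:])

# Rebuilding from the first timestamp covers any n, assuming `dates` is a
# regular range with no gaps.
rebuilt = pd.date_range(dates[0], periods=len(dates) + n, freq=dates.freq)

print(len(via_shift), len(rebuilt))
```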

Try this:

In [207]: dates = dates.append(pd.DatetimeIndex(pd.Series(dates[-1] + pd.offsets.BusinessMonthEnd())))

In [208]: dates
Out[208]: DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29', '2016-05-31'], dtype='datetime64[ns]', freq=None)

Or, using a list ([...]) instead of pd.Series():

In [211]: dates.append(pd.DatetimeIndex([dates[-1] + pd.offsets.BusinessMonthEnd()]))
Out[211]: DatetimeIndex(['2016-01-29', '2016-02-29', '2016-03-31', '2016-04-29', '2016-05-31'], dtype='datetime64[ns]', freq=None)

I'd use the .tshift function to find the next index value, then append a new row at that position:

dr = pd.date_range(start='1/1/2020', periods=5, freq='D')
df = pd.DataFrame(data=[1,2,3,4,5], 
                  index=dr,
                  columns=['A'])
df.head()
            A
2020-01-01  1
2020-01-02  2
2020-01-03  3
2020-01-04  4
2020-01-05  5 <-

df.tshift()
            A
2020-01-02  1
2020-01-03  2
2020-01-04  3
2020-01-05  4
2020-01-06  5 <-

other = pd.DataFrame([6], columns=['A'], index=[df.tshift().index[-1]])
other.head()
            A
2020-01-06  6

df.append(other)
            A
2020-01-01  1
2020-01-02  2
2020-01-03  3
2020-01-04  4
2020-01-05  5
2020-01-06  6 <-
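As far as I know, both DataFrame.tshift and DataFrame.append were deprecated and then removed in pandas 2.0, so the session above will not run on recent versions. A rough equivalent, sketched here with the illustrative names next_stamp and extended, takes the next timestamp from the index's own freq and uses pd.concat:

```python
import pandas as pd

dr = pd.date_range(start="2020-01-01", periods=5, freq="D")
df = pd.DataFrame({"A": [1, 2, 3, 4, 5]}, index=dr)

# Next timestamp at the index's own frequency (replaces df.tshift().index[-1]).
next_stamp = df.index[-1] + df.index.freq

# One-row frame for the new period, then concatenate (replaces df.append).
other = pd.DataFrame({"A": [6]}, index=pd.DatetimeIndex([next_stamp]))
extended = pd.concat([df, other])

print(extended)
```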
