Pandas dataframe shift column by date

I have a panel dataset which is indexed by Date and ID and looks something like this:

import pandas as pd

df = pd.DataFrame({'Date':['2005-12-31', '2006-03-31', '2006-09-30','2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
                   'ID':[1,1,1,2,2,2,2],
                   'Value':[14,25,34,23,67,14,46]})

I'm trying to lag Value within each ID by one quarter, but the quarters in Date are not always consecutive. groupby.shift doesn't give me what I want, or maybe I'm missing something. Here is what I tried:

df['pre_value'] = df.groupby('ID')['Value'].shift(1)

This does shift values within each ID, but it ignores the date. Note that for ID == 1 the quarter 2006-06-30 is missing, so pre_value for its 2006-09-30 row should be NaN, yet shift(1) fills in 25 from 2006-03-31. I'm also looking into a MultiIndex or declaring the dataset as a panel, but that complicates my other calculations. Is there an easy way to do this with a plain DataFrame?

I would make a copy of the DataFrame, shift its Date forward by one period (it seems you want a shift of one quarter), and then left-merge the copy back onto the original on Date and ID. To make shifting the dates easy, first convert the string dates to pandas quarterly periods.

In [34]: df['Date'] = pd.PeriodIndex(df['Date'], freq='Q')

In [35]: df2 = df.copy()

In [36]: df2['Date'] += 1

In [37]: df.merge(df2, on=['Date','ID'], suffixes=('', '_lag1'), how='left')
Out[37]:
    Date  ID  Value  Value_lag1
0 2005Q4   1     14         NaN
1 2006Q1   1     25          14
2 2006Q3   1     34         NaN
3 2005Q4   2     23         NaN
4 2006Q1   2     67          23
5 2006Q2   2     14          67
6 2006Q3   2     46          14
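
If you would rather keep the groupby.shift approach from the question, one possible alternative (not part of the answer above, just a sketch) is to shift as before and then blank out any row whose previous row within the same ID is not exactly one quarter earlier. The helper column qnum and the integer quarter counter are names and choices made up for this illustration, and the sketch assumes each (ID, quarter) pair appears at most once.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
             '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
    'ID': [1, 1, 1, 2, 2, 2, 2],
    'Value': [14, 25, 34, 23, 67, 14, 46],
})

dates = pd.to_datetime(df['Date'])

# Hypothetical helper column: an integer quarter counter, so that
# consecutive quarters differ by exactly 1 (also across year ends).
df['qnum'] = dates.dt.year * 4 + dates.dt.quarter

# shift() works on row order, so sort within each ID first.
df = df.sort_values(['ID', 'qnum'])
grouped = df.groupby('ID')

# Previous row's value and the distance (in quarters) to that row.
prev_value = grouped['Value'].shift(1)
gap = df['qnum'] - grouped['qnum'].shift(1)

# Keep the lagged value only where the previous row is one quarter back.
df['pre_value'] = prev_value.where(gap == 1)
df = df.drop(columns='qnum')

print(df)
```

This avoids the copy and merge, at the cost of handling only "previous row is exactly one quarter earlier"; the merge version above generalises more directly to other lags.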
