I have a panel dataset which is indexed by Date and ID and looks something like this:
import pandas as pd

df = pd.DataFrame({'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
                            '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
                   'ID': [1, 1, 1, 2, 2, 2, 2],
                   'Value': [14, 25, 34, 23, 67, 14, 46]})
I'm trying to shift the values within each ID by Date, where the dates are quarter-ends that may have gaps. groupby.shift doesn't give me the right thing, or maybe I'm missing something. Here is what I did:
df['pre_value'] = df.groupby('ID')['Value'].shift(1)
This does shift values within the same ID, but it ignores the date. Note that for ID==1 the 2006-06-30 row is missing, so the pre_value for its 2006-09-30 row should really be NaN. I'm also looking into multi-indexing or declaring the dataset as a panel, but that complicates my other calculations. Is there an easy way to do this with a plain DataFrame?
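To make the problem concrete, here is a minimal sketch that reproduces it on the sample data. The column name quarters_since_prev and the year*4+quarter numbering are illustrative assumptions, not part of the original code; they only serve to expose where a quarter is skipped.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
             '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
    'ID': [1, 1, 1, 2, 2, 2, 2],
    'Value': [14, 25, 34, 23, 67, 14, 46],
})

# Positional shift within each ID: takes the previous row, whatever its date.
df['pre_value'] = df.groupby('ID')['Value'].shift(1)

# Illustrative helper column (an assumption, not in the original post):
# number each quarter consecutively, then diff within each ID to see
# how many quarters separate a row from the previous row of the same ID.
quarters = pd.PeriodIndex(df['Date'], freq='Q')
quarter_no = pd.Series(quarters.year * 4 + quarters.quarter, index=df.index)
df['quarters_since_prev'] = quarter_no.groupby(df['ID']).diff()

print(df)
```

For the ID==1 row dated 2006-09-30, pre_value is filled from 2006-03-31 even though quarters_since_prev is 2, which is exactly the case that should be NaN.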
I would just make a copy of the DataFrame, shift Date forward by 1 (it seems you want to shift by one quarter), and then merge the copy back onto the original DataFrame. To shift the dates, you can convert the string dates to pandas quarterly periods first, so that shifting becomes a simple integer addition.
In [34]: df['Date'] = pd.PeriodIndex(df['Date'], freq='Q')
In [35]: df2 = df.copy()
In [36]: df2['Date'] += 1
In [37]: df.merge(df2, on=['Date','ID'], suffixes=('', '_lag1'), how='left')
Out[37]:
     Date  ID  Value  Value_lag1
0  2005Q4   1     14         NaN
1  2006Q1   1     25          14
2  2006Q3   1     34         NaN
3  2005Q4   2     23         NaN
4  2006Q1   2     67          23
5  2006Q2   2     14          67
6  2006Q3   2     46          14
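The same steps can be wrapped in a small function, sketched below as a self-contained script. The function name add_quarter_lag and its parameters n and value_col are hypothetical, introduced only for illustration; the logic is the copy, shift, and left-merge shown above, assuming the data has at most one row per (Date, ID) pair.

```python
import pandas as pd


def add_quarter_lag(df, n=1, value_col='Value'):
    """Return a copy of df with value_col lagged by n calendar quarters per ID.

    Hypothetical helper: assumes columns 'Date' and 'ID' exist and that
    each (Date, ID) pair occurs at most once.
    """
    out = df.copy()
    # Convert dates to quarterly periods so a lag is an integer addition.
    out['Date'] = pd.PeriodIndex(out['Date'], freq='Q')

    # Copy of the key and value columns, with dates moved forward n quarters.
    lagged = out[['Date', 'ID', value_col]].copy()
    lagged['Date'] = lagged['Date'] + n
    lagged = lagged.rename(columns={value_col: f'{value_col}_lag{n}'})

    # Left merge: rows with no observation n quarters earlier get NaN.
    return out.merge(lagged, on=['Date', 'ID'], how='left')


df = pd.DataFrame({
    'Date': ['2005-12-31', '2006-03-31', '2006-09-30',
             '2005-12-31', '2006-03-31', '2006-06-30', '2006-09-30'],
    'ID': [1, 1, 1, 2, 2, 2, 2],
    'Value': [14, 25, 34, 23, 67, 14, 46],
})

result = add_quarter_lag(df)
print(result)
```

Because the merge matches on the shifted period rather than on row position, a missing quarter leaves the lag empty instead of borrowing an older value. The lag column comes back as float, since NaN cannot be stored in an integer column by default.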