Trying to create lags for pandas columns with column_names
Sample DF Code:
import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,10,size=(4,2)))
df.shift(1)
Output:
0 1
0 NaN NaN
1 9.0 2.0
2 4.0 5.0
3 6.0 0.0
But when I try to create this with column names, I get NaN everywhere:
df1=pd.DataFrame(df.shift(1),columns=["lag"+str(each) for each in df.columns])
df1
Output:
lag0 lag1
0 NaN NaN
1 NaN NaN
2 NaN NaN
3 NaN NaN
Any suggestion to rectify this?
Here's another approach:
df = df.shift(1)
l = list(df.columns.astype('str'))
s = 'lag'
cols = [s + i for i in l]
df.columns = cols
df
lag0 lag1
0 NaN NaN
1 7.0 4.0
2 4.0 8.0
3 0.0 9.0
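The problem is that the column names differ. When you pass a DataFrame to the DataFrame constructor together with columns, pandas selects the columns by label instead of renaming them. The shifted DataFrame has columns 0 and 1, not lag0 and lag1, so nothing matches and the new columns are filled with missing values. This is called index alignment.
To prevent it, you can convert the values to a NumPy array, which carries no labels: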
df1=pd.DataFrame(df.shift(1).to_numpy(),columns=["lag"+str(each) for each in df.columns])
print (df1)
lag0 lag1
0 NaN NaN
1 2.0 2.0
2 8.0 3.0
3 6.0 8.0
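To make the difference visible, here is a minimal sketch that runs both constructor calls side by side. The fixed values and the names shifted, new_cols, aligned and relabeled are only illustrative assumptions, chosen so the result is reproducible; they are not from the question.

```python
import pandas as pd

# Small fixed frame (illustrative values) so the result does not depend on random data.
df = pd.DataFrame({0: [1, 2, 3, 4], 1: [5, 6, 7, 8]})
shifted = df.shift(1)
new_cols = ["lag" + str(c) for c in df.columns]

# DataFrame input + columns: pandas looks up 'lag0'/'lag1' in `shifted`,
# finds neither label, and fills both columns with NaN.
aligned = pd.DataFrame(shifted, columns=new_cols)

# NumPy input + columns: there are no labels to align on,
# so `columns` simply names the array's columns in order.
relabeled = pd.DataFrame(shifted.to_numpy(), columns=new_cols)

print(aligned)
print(relabeled)
```

Note that to_numpy() also drops the row index, so if the original frame has a non-default index you would probably want to pass index=df.index as well.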
But it is simpler to use DataFrame.add_prefix:
df1 = df.shift().add_prefix('lag')
print (df1)
lag0 lag1
0 NaN NaN
1 1.0 1.0
2 8.0 3.0
3 0.0 4.0
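As a quick check of the add_prefix approach, the sketch below uses fixed illustrative values (an assumption, not the data from the question) and compares it with an equivalent rename call, which may be handy if you need something other than a plain prefix.

```python
import pandas as pd

# Fixed illustrative values so the outcome is reproducible.
df = pd.DataFrame({0: [1, 2, 3, 4], 1: [5, 6, 7, 8]})

# shift() defaults to periods=1; add_prefix turns labels 0, 1 into 'lag0', 'lag1'.
df1 = df.shift().add_prefix("lag")

# Equivalent relabeling with rename and a callable applied to each column label.
df2 = df.shift().rename(columns=lambda c: f"lag{c}")

print(df1)
print(df1.equals(df2))
```

Both variants keep the original row index, unlike the to_numpy() route.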