
Column names with pandas shift

I am trying to create lagged copies of pandas columns and give them new column names.

Sample DF Code:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randint(0,10,size=(4,2)))
df.shift(1)

Output:

     0   1
0   NaN NaN
1   9.0 2.0
2   4.0 5.0
3   6.0 0.0

But when I try to build the same result with new column names, I get only NaN:

df1=pd.DataFrame(df.shift(1),columns=["lag"+str(each) for each in df.columns])
df1

Output:

    lag0  lag1
0   NaN   NaN
1   NaN   NaN
2   NaN   NaN
3   NaN   NaN

Any suggestion to rectify this?

Here's another approach:

df = df.shift(1)

# Build the new labels by prefixing each column name with 'lag'
df.columns = ['lag' + str(col) for col in df.columns]

df
    lag0    lag1
0   NaN     NaN
1   7.0     4.0
2   4.0     8.0
3   0.0     9.0

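The problem is that the column names differ. When a DataFrame is passed to the constructor together with columns=, pandas looks up the requested names ("lag0", "lag1") in the shifted frame, finds no match, and fills the new columns with missing values. This is called index alignment.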
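The alignment can be reproduced in isolation. The sketch below uses fixed values instead of the random ones from the question (an assumption made only so the result is repeatable) and shows that the constructor selects columns by label rather than renaming them:

```python
import pandas as pd

# Fixed values instead of random ones, so the result is reproducible.
df = pd.DataFrame([[9, 2], [4, 5], [6, 0], [1, 7]])
shifted = df.shift(1)

# Passing a DataFrame together with columns= selects columns by label;
# it does not rename them. "lag0" and "lag1" do not exist in `shifted`
# (its labels are 0 and 1), so both columns come back filled with NaN.
selected = pd.DataFrame(shifted, columns=["lag0", "lag1"])
print(selected)

# Asking for labels that do exist returns the shifted data unchanged.
existing = pd.DataFrame(shifted, columns=[0, 1])
print(existing)
```

The first frame contains only NaN, while the second matches df.shift(1), so the missing values come from the label lookup and not from shift itself.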

To prevent this, convert the values to a NumPy array first, so there are no labels left to align:

df1=pd.DataFrame(df.shift(1).to_numpy(),columns=["lag"+str(each) for each in df.columns])
print (df1)
   lag0  lag1
0   NaN   NaN
1   2.0   2.0
2   8.0   3.0
3   6.0   8.0

A simpler option is to use DataFrame.add_prefix:

df1 = df.shift().add_prefix('lag')
print (df1)
   lag0  lag1
0   NaN   NaN
1   1.0   1.0
2   8.0   3.0
3   0.0   4.0
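As a quick check that the two fixes agree, here is a minimal sketch on a frame with fixed values (chosen for illustration, not taken from the question):

```python
import pandas as pd

# Fixed values so both approaches can be compared on the same data.
df = pd.DataFrame([[9, 2], [4, 5], [6, 0], [1, 7]])

# Fix 1: drop the labels by converting to a NumPy array, then name the columns.
via_numpy = pd.DataFrame(
    df.shift(1).to_numpy(),
    columns=["lag" + str(col) for col in df.columns],
)

# Fix 2: keep the DataFrame and prepend the prefix to every column label.
via_prefix = df.shift(1).add_prefix("lag")

print(via_numpy)
print(via_prefix)
```

One difference worth noting: to_numpy() also discards the row index, so with a non-default index the first variant would need index=df.index passed to the constructor, whereas add_prefix keeps the index as it is.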
