I would like to shift a column in a Pandas DataFrame<\/code> , but I haven't been able to find a method to do it from the documentation without rewriting the whole DF.
Does anyone know how to do it? DataFrame:
## x1 x2
##0 206 214
##1 226 234
##2 245 253
##3 265 272
##4 283 291
In [18]: a
Out[18]:
x1 x2
0 0 5
1 1 6
2 2 7
3 3 8
4 4 9
In [19]: a['x2'] = a.x2.shift(1)
In [20]: a
Out[20]:
x1 x2
0 0 NaN
1 1 5
2 2 6
3 3 7
4 4 8
You need to use df.shift
here.
df.shift(i)
shifts the entire dataframe by i
units down.
So, for i = 1
:
Input:
x1 x2
0 206 214
1 226 234
2 245 253
3 265 272
4 283 291
Output:
x1 x2
0 Nan Nan
1 206 214
2 226 234
3 245 253
4 265 272
So, run this script to get the expected output:
import pandas as pd
df = pd.DataFrame({'x1': ['206', '226', '245',' 265', '283'],
'x2': ['214', '234', '253', '272', '291']})
print(df)
df['x2'] = df['x2'].shift(1)
print(df)
Lets define the dataframe from your example by
>>> df = pd.DataFrame([[206, 214], [226, 234], [245, 253], [265, 272], [283, 291]],
columns=[1, 2])
>>> df
1 2
0 206 214
1 226 234
2 245 253
3 265 272
4 283 291
Then you could manipulate the index of the second column by
>>> df[2].index = df[2].index+1
and finally re-combine the single columns
>>> pd.concat([df[1], df[2]], axis=1)
1 2
0 206.0 NaN
1 226.0 214.0
2 245.0 234.0
3 265.0 253.0
4 283.0 272.0
5 NaN 291.0
Perhaps not fast but simple to read. Consider setting variables for the column names and the actual shift required.
Edit: Generally shifting is possible by df[2].shift(1)
as already posted however would that cut-off the carryover.
If you don't want to lose the columns you shift past the end of your dataframe, simply append the required number first:
offset = 5
DF = DF.append([np.nan for x in range(offset)])
DF = DF.shift(periods=offset)
DF = DF.reset_index() #Only works if sequential index
I suppose imports
import pandas as pd
import numpy as np
First append new row with NaN, NaN,...
at the end of DataFrame ( df
).
s1 = df.iloc[0] # copy 1st row to a new Series s1
s1[:] = np.NaN # set all values to NaN
df2 = df.append(s1, ignore_index=True) # add s1 to the end of df
It will create new DF df2. Maybe there is more elegant way but this works.
Now you can shift it:
df2.x2 = df2.x2.shift(1) # shift what you want
Trying to answer a personal problem and similar to yours I found on Pandas Doc what I think would answer this question:
DataFrame.shift (periods=1, freq=None, axis=0) Shift index by desired number of periods with an optional time freq
Notes
If freq is specified then the index values are shifted but the data is not realigned. That is, use freq if you would like to extend the index when shifting and preserve the original data.
Hope to help future questions in this matter.
df3
1 108.210 108.231
2 108.231 108.156
3 108.156 108.196
4 108.196 108.074
... ... ...
2495 108.351 108.279
2496 108.279 108.669
2497 108.669 108.687
2498 108.687 108.915
2499 108.915 108.852
df3['yo'] = df3['yo'].shift(-1)
yo price
0 108.231 108.210
1 108.156 108.231
2 108.196 108.156
3 108.074 108.196
4 108.104 108.074
... ... ...
2495 108.669 108.279
2496 108.687 108.669
2497 108.915 108.687
2498 108.852 108.915
2499 NaN 108.852
This is how I do it:
df_ext = pd.DataFrame(index=pd.date_range(df.index[-1], periods=8, closed='right'))
df2 = pd.concat([df, df_ext], axis=0, sort=True)
df2["forecast"] = df2["some column"].shift(7)
Basically I am generating an empty dataframe with the desired index and then just concatenate them together. But I would really like to see this as a standard feature in pandas so I have proposed an enhancement to pandas.
I'm new to pandas, and I may not be understanding the question, but this solution worked for my problem:
# Shift contents of column 'x2' down 1 row
df['x2'] = df['x2'].shift()
Or, to create a new column with contents of 'x2' shifted down 1 row
# Create new column with contents of 'x2' shifted down 1 row
df['x3'] = df['x2'].shift()
I had a read of the official docs for shift() while trying to figure this out, but it doesn't make much sense to me, and has no examples referencing this specific behavior.
Note that the last row of column 'x2' is effectively pushed off the end of the Dataframe. I expected shift() to have a flag to change this behaviour, but I can't find anything.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.