I am trying to prepare data for some time-series modeling with Python
Pandas
(first timer). My DataFrame
looks like this:
df = pd.DataFrame({
'time': [0, 1, 2, 3, 4],
'colA': ['a', 'b', 'c', 'd', 'e'],
'colB': ['v', 'w', 'x', 'y', 'z'],
'value' : [10, 11, 12, 13, 14]
})
# time colA colB value
# 0 0 a v 10
# 1 1 b w 11
# 2 2 c x 12
# 3 3 d y 13
# 4 4 e z 14
Is there a combination of functions that could transform it into the following format?
# colA-2 colA-1 colA colB-2 colB-1 colB value
# _ _ a _ _ v 10
# _ a b _ v w 11
# a b c v w x 12
# b c d w x y 13
# c d e x y z 14
I am very new to Python/Pandas and I do not have any concrete code/results that got me even close to what I need...
You can use the shift function:
df['colA-2'] =df['colA'].shift(2, fill_value='-' )
df['colA-1'] =df['colA'].shift(1,fill_value='-')
...
I'd use pd.concat
pd.concat([
df[['colA', 'colB']].shift(i).add_suffix(f'-{i}')
for i in range(1, 3)], axis=1
).fillna('-').join(df)
colA-1 colB-1 colA-2 colB-2 time colA colB value
0 - - - - 0 a v 10
1 a v - - 1 b w 11
2 b w a v 2 c x 12
3 c x b w 3 d y 13
4 d y c x 4 e z 14
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.