简体   繁体   中英

Create new dataframe column from a shifted existing column

I have a dataframe with open, high, low, close prices of a stock. I want to add an additional column that has the percent change between today's open and yesterday's high. This is my current implementation, however, the resulting column contains percent changes between the current day's high and open.

df

  open  high  low  close
0 100   110   95   103
1 103   113   103   111
2 111   132   109   124
3 124   136   114   130

My attempt (incorrect):

df['prevhigh_curropen'] = (df['open'] - df['high']).shift(-1) / df['high'].shift(-1)

Output (incorrect):

  open  high  low  close  prevhigh_curropen
0 100   110   95   103    -0.091
1 103   113   103  111    -0.089
2 111   132   109  124    -0.159
3 124   136   114  130    -0.088

Desired output:

  open  high  low  close  prevhigh_curropen
0 100   110   95   103    nan
1 103   113   103  111    -0.064
2 111   132   109  124    -0.018
3 124   136   114  130    -0.061

Is there a non-iterative way to do this like I attempted above?

Your formula is wrong, you have to use df['high'].shift() :

df = pd.DataFrame({'open': range(1, 11), 'high': range(1, 11)})
df['prevhigh_curropen'] = df['open'].sub(df['high'].shift()) \
                                    .div(df['high'].shift()) \
                                    .mul(100)
>>> df
   open  high  prevhigh_curropen
0     1     1                NaN
1     2     2         100.000000
2     3     3          50.000000
3     4     4          33.333333
4     5     5          25.000000
5     6     6          20.000000
6     7     7          16.666667
7     8     8          14.285714
8     9     9          12.500000
9    10    10          11.111111

For your sample the output is:

>>> df
   open  high  low  close  prevhigh_curropen
0   100   110   95    103                NaN
1   103   113  103    111          -6.363636
2   111   132  109    124          -1.769912
3   124   136  114    130          -6.060606

The first value is NaN because we don't know the high value from the previous day.

We can simplify the terms slightly from (a - b) / b to (a / b) - (b / b) to (a / b) - 1 .

Mathematical Operators:

df['prevhigh_curropen'] = (df['open'] / df['high'].shift()) - 1

or with Series Methods:

df['prevhigh_curropen'] = df['open'].div(df['high'].shift()).sub(1)

*The benefit here is that we only need to shift once, and maintain 1 copy of df['high'].shift()

Resulting df :

   open  high  low  close  prevhigh_curropen
0   100   110   95    103                NaN
1   103   113  103    111          -0.063636
2   111   132  109    124          -0.017699
3   124   136  114    130          -0.060606

Setup Used:

import pandas as pd

df = pd.DataFrame({
    'open': [100, 103, 111, 124],
    'high': [110, 113, 132, 136],
    'low': [95, 103, 109, 114],
    'close': [103, 111, 124, 130]
})

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM