Iterate over a DataFrame and, based on the value of one column, compute a new column from the previous row's value
I have a small DataFrame of stock prices with their corporate actions. I would like to calculate the adjusted number of shares owned after a stock split (i.e. if you own 1000 shares and the stock has a 2-for-1 split, you then own 2000 shares). I would like to iterate over the "Stock Splits" column and, if the value != 0, multiply "ownership" by "Stock Splits"; otherwise keep the last quantity from before the split. I tried many methods, but I am not sure where I am going wrong. I think the logic is wrong, but I don't know how to fix it.
import yfinance as yf
aapl = yf.Ticker("AAPL")
hist = aapl.history(start="2014-06-01")
hist["ownership"] = 1000
Open High Low Close Volume Dividends Stock Splits ownership
Date
2014-06-02 20.338966 20.366877 19.971301 20.168608 369350800 0.0 0.0 1000
2014-06-03 20.162511 20.492319 20.155774 20.453819 292709200 0.0 0.0 1000
2014-06-04 20.450610 20.785872 20.407940 20.687378 335482000 0.0 0.0 1000
2014-06-05 20.731655 20.833356 20.616479 20.768549 303805600 0.0 0.0 1000
2014-06-06 20.850357 20.893990 20.676150 20.711439 349938400 0.0 0.0 1000
My code is as follows:
hist.loc[hist['Stock Splits']==0,'ownerAdj'] = hist['ownership'].shift(1)
hist.loc[hist['Stock Splits']!=0,'ownerAdj'] = hist['ownership'].shift(1) * hist['Stock Splits']
However, I do not always get correct figures. In the example below, AAPL had a 7-for-1 split on 2014-06-09, so the result should be 7000 from 2014-06-09 until the next split on 2020-08-31, but I get 1000 again on the day after the split:
Date Open High Low Close Volume Dividends Stock Splits ownership ownerAdj
0 2014-06-02 20.338964 20.366875 19.971299 20.168606 369350800 0.0 0.0 1000 NaN
1 2014-06-03 20.162515 20.492323 20.155778 20.453823 292709200 0.0 0.0 1000 1000.0
2 2014-06-04 20.450608 20.785870 20.407938 20.687376 335482000 0.0 0.0 1000 1000.0
3 2014-06-05 20.731645 20.833346 20.616470 20.768539 303805600 0.0 0.0 1000 1000.0
4 2014-06-06 20.850359 20.893992 20.676152 20.711441 349938400 0.0 0.0 1000 1000.0
5 2014-06-09 20.818268 21.083269 20.604921 21.042845 301660000 0.0 7.0 1000 7000.0
6 2014-06-10 21.274162 21.346027 21.013652 21.166365 251108000 0.0 0.0 1000 1000.0
7 2014-06-11 21.139424 21.280908 20.991204 21.078789 182724000 0.0 0.0 1000 1000.0
I also tried to run a loop, but I get an error:
for i, row in hist.iterrows():
    if row["Stock Splits"] == 0:
        row["ownerAdj"] = row["ownership"].shift(1)
    elif row["Stock Splits"] != 0:
        row["ownerAdj"] = row["ownership"].shift(1) * row["Stock Splits"]
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-51-2d94c5e86953> in <module>
1 for i, row in hist.iterrows():
2 if row["Stock Splits"] == 0:
----> 3 row["adjust2"] = row["ownership"].shift(1)
4 elif row["Stock Splits"] != 0:
5 row["adjust2"] = row["ownership"].shift(1) * row["Stock Splits"]
AttributeError: 'numpy.float64' object has no attribute 'shift'
You can do this in a vectorized way, without a loop:
import numpy as np
hist['ownership'] = 1000 * np.cumprod(np.maximum(hist["Stock Splits"], 1))
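As a quick sanity check that does not require downloading data, here is a minimal sketch of the same one-liner on a made-up frame. The dates and split values (a 7-for-1 split followed by a 4-for-1 split) are illustrative only, not real yfinance output, and `demo` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 0.0 means "no split on this day".
demo = pd.DataFrame(
    {"Stock Splits": [0.0, 0.0, 7.0, 0.0, 4.0, 0.0]},
    index=pd.date_range("2014-06-05", periods=6, freq="D"),
)

# Treat "no split" as a 1-for-1 split, then accumulate the ratios.
demo["ownership"] = 1000 * np.cumprod(np.maximum(demo["Stock Splits"], 1))

print(demo["ownership"].tolist())
# [1000.0, 1000.0, 7000.0, 7000.0, 28000.0, 28000.0]
```

The share count stays at 7000 after the first split and only changes again on the next split day, which is the behaviour the question asks for.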
Step by step:
# No split can be expressed as a 1.0 split (You get 1 for every 1).
# Assumes there are no reverse splits (ratios between 0 and 1); np.maximum would replace those with 1.
adj_split = np.maximum(hist["Stock Splits"], 1)
# The multiple of the initial ownership at each day compared to the first.
cumsplit = np.cumprod(adj_split)
initial_ownership = 1000
hist["ownership"] = cumsplit * initial_ownership
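For comparison, the two attempts in the question fail for different reasons. `hist['ownership'].shift(1)` always reads from the constant 1000 column, so nothing carries forward past the split day. Inside `iterrows()`, each `row["ownership"]` is a single number, which has no `shift` method. If an explicit loop is easier to follow, a minimal sketch is to keep the running total yourself. The helper name `adjusted_ownership` and the sample frame are made up for illustration.

```python
import pandas as pd

def adjusted_ownership(splits, initial=1000):
    """Return share counts after applying each split ratio in order."""
    owned = initial
    result = []
    for ratio in splits:
        if ratio != 0:  # 0 marks a day without a split
            owned = owned * ratio
        result.append(owned)  # carry the running total forward
    return pd.Series(result, index=splits.index, name="ownerAdj")

# Hypothetical data: a 7-for-1 split, then a ratio of 0.5.
demo = pd.DataFrame({"Stock Splits": [0.0, 7.0, 0.0, 0.5, 0.0]})
demo["ownerAdj"] = adjusted_ownership(demo["Stock Splits"])
print(demo["ownerAdj"].tolist())
# [1000.0, 7000.0, 7000.0, 3500.0, 3500.0]
```

Unlike the `np.maximum(..., 1)` version, this loop also applies ratios below 1, such as the 0.5 in the sample, assuming reverse splits are recorded that way in your data. On large frames the vectorized form will usually be faster.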