
Iterate over a DataFrame and, based on the value of one column, compute a new column from the previous row's value

I have a small DataFrame of stock prices and their corporate actions. I would like to calculate the adjusted number of shares owned after a stock split (i.e. if you own 1000 shares and the stock has a 2-for-1 split, your holding becomes 2000 shares). I would like to iterate over the "Stock Splits" column and, if the value != 0, multiply "ownership" by "Stock Splits"; otherwise keep the last quantity from before the split. I have tried many methods, but I am not sure where I am going wrong. I think the logic is wrong, but I don't know how to fix it.

import yfinance as yf
aapl = yf.Ticker("AAPL")
hist = aapl.history(start="2014-06-01")
hist["ownership"] = 1000


                 Open       High        Low      Close     Volume  Dividends  Stock Splits  ownership
Date
2014-06-02  20.338966  20.366877  19.971301  20.168608  369350800        0.0           0.0       1000
2014-06-03  20.162511  20.492319  20.155774  20.453819  292709200        0.0           0.0       1000
2014-06-04  20.450610  20.785872  20.407940  20.687378  335482000        0.0           0.0       1000
2014-06-05  20.731655  20.833356  20.616479  20.768549  303805600        0.0           0.0       1000
2014-06-06  20.850357  20.893990  20.676150  20.711439  349938400        0.0           0.0       1000

My code is as follows:

hist.loc[hist['Stock Splits']==0,'ownerAdj'] = hist['ownership'].shift(1)
hist.loc[hist['Stock Splits']!=0,'ownerAdj'] = hist['ownership'].shift(1) * hist['Stock Splits']

However, I do not always get the correct figures. In the example below, AAPL had a 7-for-1 split on 2014-06-09, so the result should be 7000 from 2014-06-09 until its next split on 2020-08-31, but the value goes back to 1000 right after the split:

         Date       Open       High        Low      Close     Volume  Dividends  Stock Splits  ownership  ownerAdj
0  2014-06-02  20.338964  20.366875  19.971299  20.168606  369350800        0.0           0.0       1000       NaN
1  2014-06-03  20.162515  20.492323  20.155778  20.453823  292709200        0.0           0.0       1000    1000.0
2  2014-06-04  20.450608  20.785870  20.407938  20.687376  335482000        0.0           0.0       1000    1000.0
3  2014-06-05  20.731645  20.833346  20.616470  20.768539  303805600        0.0           0.0       1000    1000.0
4  2014-06-06  20.850359  20.893992  20.676152  20.711441  349938400        0.0           0.0       1000    1000.0
5  2014-06-09  20.818268  21.083269  20.604921  21.042845  301660000        0.0           7.0       1000    7000.0
6  2014-06-10  21.274162  21.346027  21.013652  21.166365  251108000        0.0           0.0       1000    1000.0
7  2014-06-11  21.139424  21.280908  20.991204  21.078789  182724000        0.0           0.0       1000    1000.0

I also tried a loop, but I get an error:

for i, row in hist.iterrows():
    if row["Stock Splits"] == 0:
        row["ownerAdj"] = row["ownership"].shift(1)
    elif row["Stock Splits"] != 0:
        row["ownerAdj"] = row["ownership"].shift(1) * row["Stock Splits"]

 ---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-51-2d94c5e86953> in <module>
      1 for i, row in hist.iterrows():
      2     if row["Stock Splits"] == 0:
----> 3         row["adjust2"] = row["ownership"].shift(1)
      4     elif row["Stock Splits"] != 0:
      5         row["adjust2"] = row["ownership"].shift(1) * row["Stock Splits"]

AttributeError: 'numpy.float64' object has no attribute 'shift'
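
For reference, the error arises because `iterrows()` yields each row as a Series, so `row["ownership"]` is a single scalar (here a `numpy.float64`) rather than a column, and scalars have no `.shift` method. Writing to `row[...]` inside the loop also generally does not write back to `hist`. A minimal sketch of a loop that carries the running quantity forward instead is shown here. It uses a small made-up frame in place of the yfinance data, and the names `current` and `adjusted` are illustrative assumptions, not part of the original code:

```python
import pandas as pd

# Small stand-in for `hist`; only the split column matters here.
hist = pd.DataFrame(
    {"Stock Splits": [0.0, 0.0, 7.0, 0.0, 0.0]},
    index=pd.to_datetime(
        ["2014-06-05", "2014-06-06", "2014-06-09", "2014-06-10", "2014-06-11"]
    ),
)

current = 1000.0   # shares held before any split (hypothetical starting value)
adjusted = []      # running quantity, one entry per row

for split in hist["Stock Splits"]:
    if split != 0:        # a split happened on this date
        current *= split  # the new quantity persists on all later rows
    adjusted.append(current)

hist["ownerAdj"] = adjusted
print(hist)
```

The key difference from the attempts above is that the quantity is read from the previous result (`current`), not from the constant `ownership` column, which is always 1000.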

You can do this in a vectorized way:

import numpy as np
hist['ownership'] = 1000 * np.cumprod(np.maximum(hist["Stock Splits"], 1))

Step by step:

# No split can be expressed as a 1.0 split (You get 1 for every 1).
# Assumes you don't have negative splits.
adj_split = np.maximum(hist["Stock Splits"], 1)  

# The multiple of the initial ownership at each day compared to the first.
cumsplit = np.cumprod(adj_split)

initial_ownership = 1000
hist["ownership"] = cumsplit * initial_ownership
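
As a quick check of this approach, here is a sketch on a small made-up frame. The rows are illustrative: the 7-for-1 split on 2014-06-09 and the split date 2020-08-31 come from the question, while the factor of 4 for the second split is not stated there and is filled in as an assumption. The variable `alt` shows what should be an equivalent pandas-only spelling of the same idea.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for the yfinance history; only "Stock Splits" is needed.
hist = pd.DataFrame(
    {"Stock Splits": [0.0, 7.0, 0.0, 4.0, 0.0]},
    index=pd.to_datetime(
        ["2014-06-06", "2014-06-09", "2014-06-10", "2020-08-31", "2020-09-01"]
    ),
)

initial_ownership = 1000

# Treat "no split" (0.0) as a factor of 1, then accumulate the factors.
adj_split = np.maximum(hist["Stock Splits"], 1)
hist["ownership"] = initial_ownership * np.cumprod(adj_split)

# Same idea with pandas methods only: swap 0 for 1, then take the running product.
alt = initial_ownership * hist["Stock Splits"].replace(0, 1).cumprod()

print(hist)
```

One design difference is worth noting: `np.maximum(..., 1)` clamps every factor below 1 to 1, so if the data records reverse splits as fractions they would be ignored, whereas `replace(0, 1)` only changes the zeros.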


