Python pandas DataFrame: operations on values
I am trying to understand how to copy a value downward in a pandas DataFrame and then reset it when another variable changes. Specifically, in the example below, how do I make Share_Amt_To_Buy reset to 0 once my Signal switches from 1 to 0 (the row where Signal_Diff is -1)?
Using .cumsum() on Share_Amt_To_Buy carries the values down but also accumulates them, which is not what I want.
My goal is that when Signal changes from 0 to 1, Share_Amt_To_Buy is calculated and copied down until Signal switches back to 0. Then, if Signal turns to 1 again, I want Share_Amt_To_Buy to be recalculated based on that point in time.
Hopefully this makes sense - please let me know.
Signal  Signal_Diff  Share_Amt_To_Buy (desired)  Share_Amt_To_Buy (current)
0       0            0                           0
0       0            0                           0
0       0            0                           0
1       1            100                         100
1       0            100                         100
1       0            100                         100
0       -1           0                           100
0       0            0                           100
1       1            180                         280
1       0            180                         280
As you can see, my signals alternate between 0 and 1, with the following meaning:
0 = no trade (no position)
1 = trade (with a position)
Signal_Diff is calculated as follows:
portfolio['Signal_Diff'] = portfolio['Signal'].diff().fillna(0.0)
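To make the meaning of Signal_Diff concrete, here is a small self-contained sketch that applies the same line to the Signal column from the table above. The DataFrame is rebuilt by hand for illustration only; the real portfolio presumably has more columns.

```python
import pandas as pd

# Signal column copied from the example table in the question
portfolio = pd.DataFrame({"Signal": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1]})

# diff() subtracts the previous row from the current one; the first row
# has no predecessor, so it comes out as NaN and is filled with 0.0
portfolio["Signal_Diff"] = portfolio["Signal"].diff().fillna(0.0)

print(portfolio)
```

A value of 1 marks the row where a position is opened, -1 marks the row where it is closed, and 0 means the signal did not change.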
The column 'Share_Amt_To_Buy' is calculated when Signal changes from 0 to 1. I have used the following as an example to calculate it:
import numpy as np

initial_cap = 100000.0
# portfolio['close'] holds my stock's closing prices as floats
portfolio['Share_Amt'] = np.where(variables['Signal']== 1.0, np.round(initial_cap / portfolio['close'] * 0.25 * portfolio['Signal']), 0.0).cumsum()
portfolio['Share_Amt_To_Buy'] = (portfolio['Share_Amt']*portfolio['Signal'])
From what I understand, pandas has no built-in formula mechanism. You can apply operations to columns, cells, and arrays and generate new arrays or values from them (df[column].count() is an example), but there is no way to have an array update itself dynamically based on another value in the same array, the way an Excel formula does.
You could always do the procedure iteratively, reading one row at a time:
>>> # some_value and some_other_value are placeholders for your own logic
>>> for index in df.index:
...     if df.loc[index, 'Signal_Diff'] == 0:
...         df.loc[index, 'Signal_Diff'] = some_value
...     elif df.loc[index, 'Signal_Diff'] == 1:
...         df.loc[index, 'Signal_Diff'] = some_other_value
Or you could create a custom function and apply it with map: https://stackoverflow.com/a/19226745/4131059
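A different, vectorized route (not one of the approaches named in this answer) may also work: compute the amount only on the rows where a position is opened, forward-fill it downward, and multiply by Signal so it drops to 0 while flat. The sketch below assumes made-up closing prices and reuses the 0.25 sizing factor from the question; `entry` and `sized` are hypothetical names.

```python
import numpy as np
import pandas as pd

initial_cap = 100000.0
portfolio = pd.DataFrame({
    "Signal": [0, 0, 0, 1, 1, 1, 0, 0, 1, 1],
    # made-up closing prices, chosen so the amounts come out as 100 and 180
    "close": [250.0, 250.0, 250.0, 250.0, 260.0, 255.0, 240.0, 230.0, 138.89, 140.0],
})
portfolio["Signal_Diff"] = portfolio["Signal"].diff().fillna(0.0)

entry = portfolio["Signal_Diff"] == 1                      # rows where Signal goes 0 -> 1
sized = np.round(initial_cap * 0.25 / portfolio["close"])  # amount if bought on that row

portfolio["Share_Amt_To_Buy"] = (
    sized.where(entry)      # keep the amount on entry rows, NaN elsewhere
    .ffill()                # copy each entry amount downward
    .fillna(0.0)            # rows before the first entry
    * portfolio["Signal"]   # zero out rows with no position
)

print(portfolio)
```

One caveat of this sketch: if Signal is already 1 on the first row, Signal_Diff is 0 there, so that first position is never sized and stays at 0.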
EDIT:
Another solution would be to collect all indexes where Signal_Diff is 1 both before and after some change to the array, and then keep only the indexes that are new:
>>> df_old_list = df[df.Signal_Diff == 1].index.tolist()
>>> ...
>>> df_new_list = df[df.Signal_Diff == 1].index.tolist()
>>>
>>> # drop every index that was already an entry point before the change
>>> for x in df_old_list:
...     if x in df_new_list:
...         df_new_list.remove(x)
Then recalculate only for the indexes left in df_new_list.
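The same comparison can be sketched without an explicit loop by using `Index.difference`. The data and the change applied to it are made up for illustration, and `old_entries`, `new_entries`, and `to_recalculate` are hypothetical names.

```python
import pandas as pd

df = pd.DataFrame({"Signal": [0, 1, 1, 0, 0, 0]})
df["Signal_Diff"] = df["Signal"].diff().fillna(0.0)
old_entries = df[df.Signal_Diff == 1].index   # entry rows before the change

# hypothetical change: a second position is opened on rows 4 and 5
df.loc[4:5, "Signal"] = 1
df["Signal_Diff"] = df["Signal"].diff().fillna(0.0)
new_entries = df[df.Signal_Diff == 1].index   # entry rows after the change

# entry rows that did not exist before the change
to_recalculate = new_entries.difference(old_entries)
print(to_recalculate.tolist())
```

Only the rows in `to_recalculate` would then need their Share_Amt_To_Buy computed again.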