pandas: dividing a column by lagged values
I'm trying to divide a pandas DataFrame column by a lagged copy of itself; the lag is 1 in this example.
Create the DataFrame. This example has only one column, although my real data has dozens:
import pandas as pd
dTest = pd.DataFrame(data={'Open': [0.99355, 0.99398, 0.99534, 0.99419]})
When I try this vectorized division (I'm a Python newbie coming from R):
dTest.loc[range(1,4),'Open'] / dTest.loc[range(0,3),'Open']
I get this output:
NaN 1 1 NaN
But I'm expecting:
1.0004327915052085 1.0013682367854484 0.9988446159101413
There's clearly something I don't understand about the data structure. I'm expecting 3 values, but it outputs 4. What am I missing?
What you tried failed because the two slices carry different index labels, which overlap only on the middle 2 rows, and pandas aligns on labels before dividing. Use shift to shift the rows and get what you want:
In [166]:
dTest['Open'] / dTest['Open'].shift()
Out[166]:
0 NaN
1 1.000433
2 1.001368
3 0.998845
Name: Open, dtype: float64
You can also use div:
In [159]:
dTest['Open'].div(dTest['Open'].shift(), axis=0)
Out[159]:
0 NaN
1 1.000433
2 1.001368
3 0.998845
Name: Open, dtype: float64
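Since the question mentions that the real data has dozens of columns, the same idea can be sketched on a whole DataFrame at once. This is only an illustration: the 'Close' column and its values are invented here, and dropping the first row with dropna is one possible way to end up with the 3 values the question expects.

```python
import pandas as pd

# Hypothetical two-column frame; 'Close' and its values are made up for illustration.
df = pd.DataFrame({
    'Open': [0.99355, 0.99398, 0.99534, 0.99419],
    'Close': [0.99400, 0.99500, 0.99450, 0.99600],
})

# shift(1) moves every column down one row, so each row is divided
# by the previous row of the same column.
ratios = df / df.shift(1)

# The first row has no predecessor and is all NaN; drop it to keep 3 rows.
ratios = ratios.dropna()
print(ratios)
```

The result keeps the original labels 1, 2 and 3, so it can still be joined back to the source frame by index.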
You can see that the indices differ when you slice, so when you use / only the common index labels produce a value and the rest become NaN:
In [164]:
dTest.loc[range(0,3),'Open']
Out[164]:
0 0.99355
1 0.99398
2 0.99534
Name: Open, dtype: float64
In [165]:
dTest.loc[range(1,4),'Open']
Out[165]:
1 0.99398
2 0.99534
3 0.99419
Name: Open, dtype: float64
The intersection of the two indices shows the overlap:
In [168]:
dTest.loc[range(0,3),'Open'].index.intersection(dTest.loc[range(1,4),'Open'].index)
Out[168]:
Int64Index([1, 2], dtype='int64')
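For readers coming from R, where vectors divide by position, the label alignment described above can be contrasted with a positional division. This is a minimal sketch rather than a recommended replacement for shift: it assumes the default integer index, uses iloc for the slices, and strips the labels from the divisor with to_numpy so that pandas has nothing to align on. The names later, earlier, aligned and positional are illustrative only.

```python
import pandas as pd

dTest = pd.DataFrame(data={'Open': [0.99355, 0.99398, 0.99534, 0.99419]})

later = dTest['Open'].iloc[1:]     # rows labelled 1, 2, 3
earlier = dTest['Open'].iloc[:-1]  # rows labelled 0, 1, 2

# Series / Series aligns on labels: the result covers labels 0..3 and is
# NaN wherever a label is missing from either side.
aligned = later / earlier

# Series / ndarray has no labels to align, so element i is divided by
# element i of the array; the result keeps the labels of `later`.
positional = later / earlier.to_numpy()

print(aligned)
print(positional)
```

The positional result matches what `dTest['Open'] / dTest['Open'].shift()` gives once its leading NaN is dropped, which is why shift is usually the simpler choice.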