
Python Pandas: Vectorized operation bug?

The date series looks something like this.

In [89]:
db.close[:5]

Out[89]:
datetime
2012-06-28 23:58:00    1.243925
2012-06-28 23:59:00    1.244125
2012-06-29 00:00:00    1.244065
2012-06-29 00:01:00    1.243875
2012-06-29 00:02:00    1.243865
Name: close

I would like to subtract the previous element from each element.

In [93]:
db.close[1:5] - db.close[:4]

Out[93]:
datetime
2012-06-28 23:58:00   NaN
2012-06-28 23:59:00     0
2012-06-29 00:00:00     0
2012-06-29 00:01:00     0
2012-06-29 00:02:00   NaN
Name: close

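The arrays were subtracted without the offset: each timestamp was subtracted from itself, and the timestamps present in only one slice came out as NaN.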

But when I compare the array elements, the offset is respected:

In [94]:
db.close[1:5] == db.close[:4]

Out[94]:
datetime
2012-06-28 23:59:00    False
2012-06-29 00:00:00    False
2012-06-29 00:01:00    False
2012-06-29 00:02:00    False
Name: close

This is actually deliberate. Arithmetic operations align the data on the index, but comparisons do not. I considered changing it in the past but found that it caused too many problems (especially when passing Series to functions expecting NumPy arrays, for example numpy.diff).
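To make the alignment behaviour concrete, here is a minimal sketch that rebuilds the five prices from the question as a stand-alone Series (the variable names `close`, `aligned` and `positional` are mine, not from the original session). It contrasts label-aligned subtraction with purely positional subtraction on the underlying NumPy arrays.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (assumed to be one-minute bars).
idx = pd.date_range("2012-06-28 23:58", periods=5, freq="min")
close = pd.Series(
    [1.243925, 1.244125, 1.244065, 1.243875, 1.243865],
    index=idx,
    name="close",
)

# Series arithmetic aligns on index labels first. Timestamps present in both
# slices are subtracted from themselves (0.0); the rest become NaN.
aligned = close.iloc[1:5] - close.iloc[:4]

# Raw NumPy arrays carry no labels, so this subtraction is positional:
# element i+1 minus element i.
values = close.to_numpy()
positional = values[1:] - values[:-1]

print(aligned)
print(positional)
```

The positional result loses the index, so it has to be re-attached by hand if the timestamps are still needed.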

EDIT: to get an aligned comparison, you can do the alignment by hand:

In [10]: numpy.equal(*a.align(b))
Out[10]: 
2000-01-03    False
2000-01-04     True
2000-01-05     True
2000-01-06     True
2000-01-07     True
2000-01-10    False
Freq: B
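The session above does not show how `a` and `b` were built, so the following is only a sketch under the assumption that they are two overlapping slices of one business-day Series. It spells out what `a.align(b)` does and, as an alternative, uses `Series.eq`, which as far as I know aligns on the index before comparing. Note that in more recent pandas releases a plain `==` between differently labelled Series raises a ValueError rather than comparing by position.

```python
import numpy as np
import pandas as pd

# Hypothetical data: six business days starting 2000-01-03.
idx = pd.date_range("2000-01-03", periods=6, freq="B")
s = pd.Series(np.arange(6.0), index=idx)
a, b = s.iloc[:5], s.iloc[1:]  # overlapping slices with different indexes

# align() reindexes both Series to the union of their indexes (outer join);
# labels missing from one side are filled with NaN.
left, right = a.align(b)

# Compare label by label. NaN never equals anything, so the two
# non-overlapping dates come out False.
equal = pd.Series(np.equal(left.to_numpy(), right.to_numpy()), index=left.index)

# Series.eq performs the alignment itself and gives the same answer.
equal_flex = a.eq(b)

print(equal)
```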

I found the answer to my own question. It may be useful to someone.

In [38]:
db.close.shift(periods=1).head() - db.close.head()

Out[38]:
datetime
2012-06-28 23:58:00        NaN
2012-06-28 23:59:00   -0.00020
2012-06-29 00:00:00    0.00006
2012-06-29 00:01:00    0.00019
2012-06-29 00:02:00    0.00001
Freq: T, Name: close
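As a sketch of the same idea on stand-alone data (the Series below is rebuilt from the question's sample values), `shift(1)` moves the values down one row while keeping the index, so the subtraction lines up each element with its predecessor. `Series.diff()` does this in one call. Note the operand order: the session above computes previous minus current, so its signs are flipped relative to "each element minus the previous one".

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (assumed to be one-minute bars).
idx = pd.date_range("2012-06-28 23:58", periods=5, freq="min")
close = pd.Series(
    [1.243925, 1.244125, 1.244065, 1.243875, 1.243865],
    index=idx,
    name="close",
)

# Each element minus the previous one; the first row has no predecessor -> NaN.
by_shift = close - close.shift(1)

# diff() with the default periods=1 computes the same differences.
by_diff = close.diff()

print(by_diff)
```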

Unfortunately, it is 2-3x slower than a normal arithmetic operation between two columns.

In [40]:

%timeit db.close.shift(periods=1) - db.close
1000 loops, best of 3: 1.64 ms per loop

%timeit db.close - db.open
1000 loops, best of 3: 700 us per loop
