Why is numpy.vectorize() changing the division output of a scalar function?
I'm getting a strange result when I vectorize a function with numpy.
import numpy as np

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        out = x * y
    else:
        out = x / y
    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)
As expected, the scalar version gives:
scalar_function(4,3)
# 1.3333333333333333
Why is the vectorized version giving this strange output?
vector_function(np.array([3,4]), np.array([4,3]))
[12 1]
While this call to the vectorized version works fine:
vector_function(np.array([4,4]), np.array([4,3]))
[1. 1.33333333]
Reading the numpy.divide docs:
Notes: The floor division operator // was added in Python 2.2 making // and / equivalent operators. The default floor division operation of / can be replaced by true division with from __future__ import division. In Python 3.0, // is the floor division operator and / the true division operator. The true_divide(x1, x2) function is equivalent to true division in Python.
This makes me think it might be a leftover issue related to Python 2. But I'm using Python 3!
The docs for numpy.vectorize state:
The output type is determined by evaluating the first element of the input, unless it is specified
Since you did not specify a return data type and the first evaluation is an integer multiplication, the output array is given an integer dtype, so the later float result is truncated (1.333... becomes 1). Conversely, when the first evaluation is a division, the output dtype is inferred as float. You can fix your code by specifying the output dtype through otypes in vector_function (it doesn't necessarily have to be as big as 64-bit for this problem):
def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function, otypes=[np.float64])
    return v_scalar_function(x, y)
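To see the inference at work, here is a minimal sketch that reuses scalar_function from the question and compares the resulting dtype for both argument orders and with otypes set. The names inferred, explicit, int_first, float_first and fixed are only illustrative.

```python
import numpy as np

def scalar_function(x, y):
    """Return x*y if x < y, and x/y otherwise."""
    if x < y:
        return x * y
    return x / y

inferred = np.vectorize(scalar_function)
explicit = np.vectorize(scalar_function, otypes=[np.float64])

# First pair (3, 4) takes the multiplication branch, so an integer dtype is inferred.
int_first = inferred(np.array([3, 4]), np.array([4, 3]))
# First pair (4, 3) takes the division branch, so a float dtype is inferred.
float_first = inferred(np.array([4, 3]), np.array([3, 4]))
# otypes fixes the output dtype regardless of the first element.
fixed = explicit(np.array([3, 4]), np.array([4, 3]))

print(int_first, int_first.dtype.kind)
print(float_first, float_first.dtype.kind)
print(fixed, fixed.dtype)
```

Only the first call loses the fractional part; the other two keep 1.333... because the output array is a float array.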
Separately, note from that very same documentation that numpy.vectorize is a convenience function that basically just wraps a Python for loop, so it is not vectorized in the sense of providing any real performance gain.
For a binary choice like this, a better overall approach would be:
def vectorized_scalar_function(arr_1, arr_2):
    return np.where(arr_1 < arr_2, arr_1 * arr_2, arr_1 / arr_2)

print(vectorized_scalar_function(np.array([4,4]), np.array([4,3])))
print(vectorized_scalar_function(np.array([3,4]), np.array([4,3])))
The above should be orders of magnitude faster and (possibly coincidentally rather than a hard-and-fast rule to rely on) doesn't suffer the type casting issue for the result.
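For integer inputs like the ones in the question, the float result is arguably not a coincidence: / on integer arrays performs true division and yields floats, and np.where returns an array with the common dtype of its two value arguments. Keep in mind that np.where evaluates both branches for every element, so a zero in the second array would still trigger a divide warning even where the product is selected. A small check, with illustrative variable names:

```python
import numpy as np

a = np.array([3, 4])
b = np.array([4, 3])

product = a * b     # integer array
quotient = a / b    # true division on integer arrays yields floats
result = np.where(a < b, product, quotient)

# The result dtype is the promoted type of the two branches.
print(product.dtype.kind, quotient.dtype, result.dtype)
print(result)
```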
Checking which statements are triggered:
import numpy as np

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        print('if x: ',x)
        print('if y: ',y)
        out = x * y
        print('if out', out)
    else:
        print('else x: ',x)
        print('else y: ',y)
        out = x/y
        print('else out', out)
    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)
vector_function(np.array([3,4]), np.array([4,3]))
if x: 3
if y: 4
if out 12
if x: 3
if y: 4
if out 12
else x: 4
else y: 3
else out 1.3333333333333333 # <-- seems that the value is calculated correctly, but the wrong dtype is returned
So, you can rewrite the scalar function:
def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y:
        out = x * y
    else:
        out = x/y
    return float(out)
vector_function(np.array([3,4]), np.array([4,3]))
array([12. , 1.33333333])
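As a side note on the trace above: the first pair (3, 4) is printed twice because, when otypes is not given, np.vectorize calls the function once on the first element just to infer the output type and then again in the actual loop. The sketch below counts calls to illustrate this, and also tries the documented cache=True option, which reuses that first result. The calls list and the variable names are only for illustration.

```python
import numpy as np

calls = []

def scalar_function(x, y):
    """Return x*y if x < y, and x/y otherwise; record each call."""
    calls.append((int(x), int(y)))
    return x * y if x < y else x / y

x, y = np.array([3, 4]), np.array([4, 3])

# Default: one extra call on the first element to infer the dtype.
np.vectorize(scalar_function)(x, y)
default_calls = list(calls)

# cache=True: the type-inference call is reused for the first element.
calls.clear()
np.vectorize(scalar_function, cache=True)(x, y)
cached_calls = list(calls)

print(len(default_calls), default_calls)
print(len(cached_calls), cached_calls)
```

Either way the output dtype is still inferred from that first result, so float(out) or otypes remains the actual fix for the truncation.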