Why is numpy.vectorize() changing the division output of a scalar function?

I'm getting a strange result when I vectorize a function with numpy.

import numpy as np
def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y :
        out = x * y 
    else:
        out = x/y 
    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)

The scalar function itself behaves as expected:

scalar_function(4,3)
# 1.3333333333333333

Why is the vectorized version giving this strange output?

vector_function(np.array([3,4]), np.array([4,3]))
[12  1]

While this call to the vectorized version works fine:

vector_function(np.array([4,4]), np.array([4,3]))
[1.         1.33333333]

Reading the numpy.divide documentation:

Notes: The floor division operator // was added in Python 2.2 making // and / equivalent operators. The default floor division operation of / can be replaced by true division with from __future__ import division. In Python 3.0, // is the floor division operator and / the true division operator. The true_divide(x1, x2) function is equivalent to true division in Python.

This makes me think it might be a leftover issue related to Python 2. But I'm using Python 3!

The docs for numpy.vectorize state:

The output type is determined by evaluating the first element of the input, unless it is specified.

Since you did not specify a return data type, and the first element is evaluated with integer multiplication (3 * 4), the output array is given an integer type, and the later division result is truncated to fit it. Conversely, when the first operation is division, the output type is inferred as float. You can fix your code by specifying the output type with otypes in vector_function (it doesn't necessarily have to be as big as 64-bit for this problem):

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function, otypes=[np.float64])
    return v_scalar_function(x, y)
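To see the inference at work, here is a minimal check that compares the result dtype with and without otypes. The variable names `inferred` and `pinned` are only for illustration, and the exact integer dtype may differ by platform.

```python
import numpy as np

def scalar_function(x, y):
    """Return x*y if x < y, otherwise x/y."""
    return x * y if x < y else x / y

x = np.array([3, 4])
y = np.array([4, 3])

# Without otypes, the dtype is inferred from the first call (3 * 4 is an integer).
inferred = np.vectorize(scalar_function)(x, y)

# With otypes, the output dtype is fixed up front.
pinned = np.vectorize(scalar_function, otypes=[np.float64])(x, y)

print(inferred.dtype, inferred)
print(pinned.dtype, pinned)
```

The first result should come back with an integer dtype and the second as float64, which matches the behaviour described in the question.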

Separately, you should also note from that very same documentation that numpy.vectorize is a convenience function that basically just wraps a Python for loop, so it is not vectorized in the sense of providing any real performance gains.
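As a rough mental model (this is a sketch, not NumPy's actual implementation), the vectorized call behaves much like the explicit loop below. `loop_equivalent` is a hypothetical helper written for this illustration.

```python
import numpy as np

def scalar_function(x, y):
    """Return x*y if x < y, otherwise x/y."""
    return x * y if x < y else x / y

def loop_equivalent(x, y, dtype=np.float64):
    """Hypothetical helper: apply scalar_function element by element."""
    x, y = np.broadcast_arrays(x, y)
    out = np.empty(x.shape, dtype=dtype)
    # One Python-level function call per element, which is why this is slow.
    for idx in np.ndindex(x.shape):
        out[idx] = scalar_function(x[idx], y[idx])
    return out

x = np.array([3, 4])
y = np.array([4, 3])

looped = loop_equivalent(x, y)
vectorized = np.vectorize(scalar_function, otypes=[np.float64])(x, y)
print(looped, vectorized)
```

Both versions call the Python function once per element, so neither avoids interpreter overhead.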

For a binary choice like this, a better overall approach would be:

def vectorized_scalar_function(arr_1, arr_2):
    return np.where(arr_1 < arr_2, arr_1 * arr_2, arr_1 / arr_2)

print(vectorized_scalar_function(np.array([4,4]), np.array([4,3])))
print(vectorized_scalar_function(np.array([3,4]), np.array([4,3])))

The above should be orders of magnitude faster and (possibly coincidentally, rather than as a hard-and-fast rule to rely on) doesn't suffer from the type-casting issue in the result.
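One way to check why the np.where version returns floats here is to inspect the dtypes of the two branches. The sketch below uses illustrative variable names (`product`, `quotient`, `common`) and relies on NumPy's type promotion between the integer product and the float quotient.

```python
import numpy as np

a = np.array([3, 4])
b = np.array([4, 3])

product = a * b    # integer array
quotient = a / b   # true division yields a float array

# np.where returns a single array, so the two branches are promoted
# to a common dtype.
common = np.result_type(product, quotient)
result = np.where(a < b, product, quotient)

print(product.dtype, quotient.dtype, common, result.dtype)
```

Note that np.where receives both branch arrays fully computed, so a zero in the second array would still trigger a division warning even at positions where the product is selected.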

Checking which statements are triggered:

import numpy as np

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y :
        print('if x: ',x)
        print('if y: ',y)
        out = x * y 
        print('if out', out)
    else:
        print('else x: ',x)
        print('else y: ',y)
        out = x/y
        print('else out', out)

    return out

def vector_function(x, y):
    """
    Make it possible to accept vectors as input
    """
    v_scalar_function = np.vectorize(scalar_function)
    return v_scalar_function(x, y)


vector_function(np.array([3,4]), np.array([4,3]))

if x:  3
if y:  4
if out 12
if x:  3
if y:  4
if out 12
else x:  4
else y:  3
else out 1.3333333333333333 # <-- seems that the value is calculated correctly, but the wrong dtype is returned
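The trace shows the else branch computing 1.3333333333333333; the loss happens afterwards, when the value is stored in an integer array. Here is a small sketch of that cast (the extra -1.5 is not from the question and is only there to show the direction of truncation):

```python
import numpy as np

# Values as computed by the scalar function, plus one illustrative negative.
computed = np.array([12, 4 / 3, -1.5])

# Casting floats to an integer dtype drops the fractional part
# (truncation toward zero), it does not round.
stored = computed.astype(np.int64)

print(computed)
print(stored)
```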

So, you can rewrite the scalar function so that it always returns a float:

def scalar_function(x, y):
    """ A function that returns x*y if x<y and x/y otherwise
    """
    if x < y :
        out = x * y 
    else:
        out = x/y
    return float(out)


vector_function(np.array([3,4]), np.array([4,3]))
array([12.        ,  1.33333333])
