
Numpy: Vectorization in the context of a 1:many operation

Suppose I have the following functions, defined only to create the function topology the question is about:

import numpy as np

def foo(x, y):
  # Repeat x y times along a new leading axis.
  return np.asarray([x for i in range(y)])

bar = lambda x: foo(x, 10)
barv = np.vectorize(bar)

z = np.asarray([1, 2, 3])

And the following routine:

for i in range(z.shape[0]):
  rng = np.arange(z[i],100)
  # res = barv(rng)
  res = np.asarray(list(map(bar,rng)))

The above routine works. However, if I uncomment and run the vectorized version, i.e.:

for i in range(z.shape[0]):
  rng = np.arange(z[i],100)
  res = barv(rng)

The code fails with the following error:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-14-660195661e55>", line 3, in <module>
    res = barv(rng)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2091, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2170, in _vectorize_call
    res = array(outputs, copy=False, subok=True, dtype=otypes[0])
ValueError: setting an array element with a sequence.

The error makes sense. However, there must be some way to do a vectorized 1:many operation in numpy?

vectorize is meant to be used with scalar functions, ones that take scalar inputs and return a scalar output. Your foo and bar are not like that:

In [729]: foo(z,10)                                                                              
Out[729]: 
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
In [730]: bar(z)                                                                                 
Out[730]: 
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])

Your bar returns a 2d array if given a 1d input, or a 1d array if given a scalar input:

In [734]: bar(4)                                                                                 
Out[734]: array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4])
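
A minimal sketch of that distinction, assuming the same bar as in the question. The square helper is hypothetical and only there to show a scalar-in, scalar-out function; the failing call reproduces the error from the question.

```python
import numpy as np

def bar(x):
    # Same behaviour as the question's bar: repeat x ten times.
    return np.asarray([x for _ in range(10)])

z = np.asarray([1, 2, 3])

# Scalar in, scalar out: the case np.vectorize is designed for.
square = np.vectorize(lambda x: x * x)  # hypothetical helper, not in the question
squares = square(z)

# bar returns a length-10 array per scalar input. With default settings
# vectorize infers an integer output dtype from the first call and then
# cannot store an array in each element.
try:
    np.vectorize(bar)(z)
    failed = False
except (ValueError, TypeError):
    failed = True

print(squares, failed)
```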

We can tell vectorize to expect an object return:

In [735]: barv = np.vectorize(bar, otypes=[object])                                              
In [736]: barv(4)                                                                                
Out[736]: array([4, 4, 4, 4, 4, 4, 4, 4, 4, 4], dtype=object)
In [737]: barv(z)                                                                                
Out[737]: 
array([array([1, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
       array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2]),
       array([3, 3, 3, 3, 3, 3, 3, 3, 3, 3])], dtype=object)

which can be turned into a 2d array with:

In [738]: np.stack(_)                                                                            
Out[738]: 
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])

vectorize also has a signature parameter that might help in this case. But in my experience it's even slower.
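
For reference, a sketch of what the signature route could look like, assuming the same bar as above. The signature string '()->(n)' declares a scalar input and a 1d output of some length n; the name barv_sig is made up for this example.

```python
import numpy as np

def bar(x):
    # Same behaviour as the question's bar: repeat x ten times.
    return np.asarray([x for _ in range(10)])

z = np.asarray([1, 2, 3])

# '()->(n)': each scalar input maps to a 1d array of length n.
barv_sig = np.vectorize(bar, signature='()->(n)')
res = barv_sig(z)

print(res.shape)
```

The result is a regular 2d array with one row per input element, so no np.stack step is needed afterwards. It still calls bar once per element in Python.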

But we don't need vectorize here; a simple list comprehension is just as good, probably better:

In [739]: np.stack([bar(i) for i in z])                                                          
Out[739]: 
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])

vectorize makes 'broadcasting' with several input arrays easier, but it does not improve speed. See if you can figure out what vectorize is doing here with foo:

In [743]: f = np.vectorize(foo, otypes=[object])                                                 
In [744]: f(np.array([1,2,3]), np.array([2,3,4]))                                                
Out[744]: array([array([1, 1]), array([2, 2, 2]), array([3, 3, 3, 3])], dtype=object)
In [745]: f(np.array([1,2,3]), np.array([[2],[3]]))                                              
Out[745]: 
array([[array([1, 1]), array([2, 2]), array([3, 3])],
       [array([1, 1, 1]), array([2, 2, 2]), array([3, 3, 3])]],
      dtype=object)
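
One way to read those two results: vectorize broadcasts the inputs against each other and calls foo once per broadcast element pair. The following is a sketch of an explicit loop that should produce the same object array; the names manual and via_vectorize are invented for this example.

```python
import numpy as np

def foo(x, y):
    # Same as the question's foo: repeat x y times.
    return np.asarray([x for _ in range(y)])

a = np.array([1, 2, 3])
b = np.array([[2], [3]])

# Broadcast the two inputs to a common (2, 3) shape and call foo per pair.
bc = np.broadcast(a, b)
manual = np.empty(bc.shape, dtype=object)
for idx, (x, y) in zip(np.ndindex(*bc.shape), bc):
    manual[idx] = foo(x, y)

# The vectorize version from the transcript above.
f = np.vectorize(foo, otypes=[object])
via_vectorize = f(a, b)

print(manual.shape, via_vectorize.shape)
```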

edit

Correct numpy vectorization:

In [762]: np.repeat(z[:,None],10,1)                                                              
Out[762]: 
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])
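
Applied to the routine in the question, the same idea might look like the sketch below. It assumes the goal is the array that np.asarray(list(map(bar, rng))) produced; the results list is added only so the outputs can be inspected.

```python
import numpy as np

def bar(x):
    # Same behaviour as the question's bar: repeat x ten times.
    return np.asarray([x for _ in range(10)])

z = np.asarray([1, 2, 3])

results = []
for start in z:
    rng = np.arange(start, 100)
    # One row per element of rng, each row holding that element 10 times.
    res = np.repeat(rng[:, None], 10, axis=1)
    results.append(res)

# Check against the original map-based version for the first range.
reference = np.asarray(list(map(bar, np.arange(z[0], 100))))
print(results[0].shape, np.array_equal(results[0], reference))
```

If a read-only view is enough, np.broadcast_to(rng[:, None], (rng.size, 10)) gives the same values without copying.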

Some time comparisons:

In [766]: timeit np.stack(barv(z))                                                               
60 µs ± 1.35 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [767]: timeit np.stack([bar(i) for i in z])                                                   
39.6 µs ± 132 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [768]: timeit np.repeat(z[:,None],10,1)                                                       
4.12 µs ± 14.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
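
Outside IPython, a comparable measurement can be sketched with the standard timeit module. The candidates dictionary and its labels are made up for this example, and the absolute numbers will vary by machine and numpy version.

```python
import timeit
import numpy as np

def bar(x):
    # Same behaviour as the question's bar: repeat x ten times.
    return np.asarray([x for _ in range(10)])

z = np.asarray([1, 2, 3])
barv = np.vectorize(bar, otypes=[object])

candidates = {
    "vectorize + stack": lambda: np.stack(barv(z)),
    "list comprehension + stack": lambda: np.stack([bar(i) for i in z]),
    "repeat": lambda: np.repeat(z[:, None], 10, 1),
}

# Best of 3 repeats, 1000 calls each, reported as seconds per call.
timings = {
    name: min(timeit.repeat(fn, number=1000, repeat=3)) / 1000
    for name, fn in candidates.items()
}

for name, t in timings.items():
    print(f"{name}: {t * 1e6:.1f} µs per call")
```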

Your foo already works with an array input. There was no need for an np.vectorize wrapper.

In [783]: foo(z,10).T                                                                            
Out[783]: 
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3, 3, 3, 3, 3, 3]])
In [784]: timeit foo(z,10).T                                                                     
10.8 µs ± 364 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
