
Are element-wise operations faster with NumPy functions than operators?

I recently came across a great SO post in which a user suggests that numpy.sum is faster than Python's sum when it comes to dealing with NumPy arrays.

This made me think, are element-wise operations on NumPy arrays faster with NumPy functions than operators? If so, then why is this the case?

Consider the following example.

import numpy as np
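# NB: the size argument must be an integer (newer NumPy rejects the float
# 1e10), and 1e10 elements (~80 GB of float64) will not fit in typical
# memory anyway, so the timings below use much smaller arrays.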
a = np.random.random(int(1e10))
b = np.random.random(int(1e10))

Will np.subtract(a, b) be reliably faster than a - b?

No, not in a significant way.

The reason np.sum is faster than sum is that sum "naively" iterates over the iterable (in this case, the NumPy array), calling each element's __add__ method, which imposes significant per-element overhead. NumPy's implementation of sum, on the other hand, is optimized: for example, it takes advantage of knowing the type (dtype) of the elements and the fact that they are contiguous in memory.
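
A quick, machine-dependent check makes the gap visible (a minimal sketch; the array size is just an illustrative choice):

import timeit
import numpy as np

a = np.random.random(100_000)

# Built-in sum: iterates in Python, boxing every element.
print(timeit.timeit(lambda: sum(a), number=100))
# np.sum: one compiled loop over the contiguous buffer.
print(timeit.timeit(lambda: np.sum(a), number=100))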

This is not the case with np.subtract(arr1, arr2) and arr1 - arr2. The latter roughly translates to the former.

The difference arises from the fact that one can override the subtraction operator in Python, so NumPy arrays override it to use the optimized version. The built-in sum, however, offers no such hook, so NumPy provides a separately optimized version of it.
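
A minimal sketch of that dispatch, using standard NumPy calls:

import numpy as np

a = np.arange(5)
b = np.ones(5)

# The - operator dispatches to ndarray.__sub__, which runs the same
# compiled machinery as the ufunc, so all three spellings agree.
print(np.array_equal(a - b, np.subtract(a, b)))  # True
print(np.array_equal(a - b, a.__sub__(b)))       # True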

Not really. You can check the timings pretty easily though.

a = np.random.normal(size=1000)
b = np.random.normal(size=1000)

%timeit np.subtract(a, b)
# 1000000 loops, best of 3: 1.57 µs per loop

%timeit a - b
# 1000000 loops, best of 3: 1.47 µs per loop

%timeit np.divide(a, b)
# 100000 loops, best of 3: 3.51 µs per loop

%timeit a / b
# 100000 loops, best of 3: 3.38 µs per loop

The numpy functions actually seem to be a tad slower. I'm not sure if that's significant, but I suspect it might be because of some additional function call overhead on top of the same implementation.

EDIT: As @unutbu notes, it's probably because np.add and friends have additional type-checking overhead to convert array-likes to arrays when necessary, so stuff like np.add([1, 2], [3, 4]) works.
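
To make that concrete:

import numpy as np

# np.add coerces array-likes (here, plain lists) to arrays first:
print(np.add([1, 2], [3, 4]))  # [4 6]

# whereas + on plain lists means concatenation, not element-wise addition:
print([1, 2] + [3, 4])         # [1, 2, 3, 4]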

Great answer by @shx2.

I will just expand slightly on sum vs. np.sum:

  • built-in sum will go through the array, take the elements one by one and convert each to a Python object before adding them together as Python objects.
  • np.sum will sum the array using an optimized loop in native code, without any conversion of the individual values (as shx2 points out, this crucially requires homogeneity and contiguity of the array contents).

The conversion of every array element to a Python object is by far the major source of the overhead.

By the way, this also explains why it is dumb to use Python's standard-library array.array type (a C-backed array of primitives) for math: sum(list) is a lot faster than sum(array.array).
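
A rough way to check this yourself (exact numbers will vary by machine and Python version):

import timeit
from array import array
import numpy as np

lst = list(range(100_000))
arr = array('l', lst)       # C array of machine-sized ints
nparr = np.arange(100_000)

# list: the elements are already Python ints, so no boxing is needed.
print(timeit.timeit(lambda: sum(lst), number=100))
# array.array: each element must be boxed into a Python int before adding.
print(timeit.timeit(lambda: sum(arr), number=100))
# np.sum: a compiled loop, with no per-element Python objects at all.
print(timeit.timeit(lambda: np.sum(nparr), number=100))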

a - b translates into a method call, a.__sub__(b) (with b.__rsub__(a) as the reflected fallback). Thus it uses the method that belongs to the variable (e.g. compiled NumPy code if a is an array).

In [20]: a.__rsub__??
Type:       method-wrapper
String Form:<method-wrapper '__rsub__' of numpy.ndarray object at 0xad27a88>
Docstring:  x.__rsub__(y) <==> y-x

The doc for np.subtract(x1, x2[, out]) shows that it is a ufunc. ufuncs typically use the same compiled operations as methods like __sub__ and __rsub__, but may add a bit of overhead to fit the ufunc protocol.

In a number of other cases np.foo(x, args) translates to x.foo(args) .
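
For instance:

import numpy as np

x = np.arange(6).reshape(2, 3)

# The module-level functions delegate to the corresponding methods.
print(np.sum(x) == x.sum())                            # True
print(np.array_equal(np.transpose(x), x.transpose()))  # True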

In general, if the functions and operators end up calling compiled numpy code to do the actual calculations, the timings will be quite similar, especially for large arrays.
