I am confused why test2 is not faster than test1 in the following code:
import timeit
setup = """
import numpy as np
A = np.ones((220, 220, 220))
B = np.ones((220, 220, 220))
class store:
    def __init__(self):
        self.C = np.empty((220, 220, 220))
Z = store()
"""
test1 = """
C = A + B
"""
test2 = """
Z.C = A + B
"""
print(timeit.timeit(test1, setup, number=1000))
print(timeit.timeit(test2, setup, number=1000))
which gave me:
40.9241290092
40.7675480843
I thought that because Z.C was preallocated memory, there would be less overhead every time I added A + B and needed a place to store the result, i.e. fewer calls to malloc behind the scenes or something like that. What am I missing?
Allocation is a fast operation; the addition itself is far more expensive:
In [7]: %timeit np.empty((220, 220, 220))
1000 loops, best of 3: 472 µs per loop
In [8]: u= np.ones((220, 220, 220))
In [9]: %timeit u+u
10 loops, best of 3: 73.5 ms per loop
So even if you update your array in place (Z.C[:] = A + B) instead of rebinding Z.C to a new array, you will not gain much in your case: about 0.5 ms saved out of roughly 73 ms, i.e. well under 1%.
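To make the difference between the assignment forms concrete, here is a minimal sketch. It assumes a small (4, 4, 4) shape and a class named Store, both chosen only for illustration. It checks whether Z.C still refers to the preallocated buffer after each kind of assignment, and also shows NumPy's out= argument, which the text above does not mention and which skips the temporary array altogether.

```python
import numpy as np

# Small arrays, chosen only to keep the illustration quick.
A = np.ones((4, 4, 4))
B = np.ones((4, 4, 4))


class Store:
    def __init__(self):
        self.C = np.empty((4, 4, 4))


Z = Store()
preallocated = Z.C  # keep a reference to the original buffer

# Plain assignment rebinds the attribute to the new array created by A + B;
# the preallocated buffer is simply no longer used.
Z.C = A + B
rebound = Z.C is not preallocated

# Slice assignment copies into the existing buffer, but A + B still
# allocates a temporary array before the copy happens.
Z.C = preallocated
Z.C[:] = A + B
slice_keeps_buffer = Z.C is preallocated

# With out=, the ufunc writes its result directly into the buffer.
np.add(A, B, out=Z.C)
out_keeps_buffer = Z.C is preallocated
```

In all three cases the element-wise addition dominates the run time, so none of these forms should be expected to change the timings much.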