Why does numba's parallel=True make this computation 3 times slower?

Question

When doing this:

import numpy as np
from numba import jit

@jit
def doit(A, Q, n):
    for i in range(len(Q)):
        Q[i] = np.sum(A[i:i+n] <= A[i+n])

A = np.random.random(1000*1000)
n = 5000
Q = np.zeros(len(A)-n)    
doit(A, Q, n)

the runtime is about 5.4 seconds on my computer.

I tried to use numba's parallelization feature:

@jit(parallel=True)
def doit(A, Q, n):
    for i in range(len(Q)):
        Q[i] = np.sum(A[i:i+n] <= A[i+n])

but instead it takes 17 seconds.

Why does numba's parallel=True make this computation 3 times slower instead of faster?

Answer 1

I just found the answer: I had missed one character. The loop must use prange instead of range:

from numba import jit, prange

@jit(parallel=True)
def doit(A, Q, n):
    for i in prange(len(Q)):
        ...  # same loop body as in the question

Then it takes 1.8 seconds instead of 5.4 seconds: the parallelization worked.
