
Numpy - Array addition using previous value

This is how I would code it if "minute" and "errands" were lists of the same size (examples below).

But they are actually two numpy arrays, so this code doesn't work. I'd also like the result "done" to be a numpy array.

done = 0
for i in minute:
    if done < minute:
        done = minute + (errands * 2)
    else:
        done = done + (errands * 2)
    print (done)

So, I have also tried using "np.where":

import numpy as np
done = 0
done = np.where(done < minute, minute + (errands * 2), done + (errands * 2))
print(done)

This would be perfect, but the problem is that it doesn't update "done" as it goes, so the equivalent of "done = done + (errands * 2)" never runs when it should (if that makes sense).

Some small examples of the numpy array:

minute = np.array([2, 2, 5, 5, 6, 7, 9, 11, 15])

errands = np.array([1, 1, 1, 7, 2, 2, 1, 1, 1])

Just to be as clear as possible, I would like the output of "done" to be:

done = np.array([4, 6, 8, 22, 26, 30, 32, 34, 36])

Thanks in advance for your help.

This is an iterative problem, because each value of "done" depends on the previous one. However, it is O(n), and can be done efficiently using numba's njit:

Setup

import numpy as np
from numba import njit

You may have to run pip install numba first.


@njit
def toggle(a, b):
    done, out = 0, []    
    for i in range(len(a)):
        if done < a[i]:
            done = a[i] + (b[i] * 2)
        else:
            done = done + (b[i] * 2)
        out.append(done)
    return np.array(out)

toggle(minute, errands)

array([ 4,  6,  8, 22, 26, 30, 32, 34, 36], dtype=int64)

Performance

minute = np.repeat(minute, 10000)
errands = np.repeat(errands, 10000)

%timeit toggle(minute, errands)
2.02 ms ± 9.84 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

%timeit toggle_no_njit(minute, errands)
64.4 ms ± 738 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
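toggle_no_njit is not shown in the answer. Presumably it is the same loop without the @njit decorator, so that it runs in the plain Python interpreter. A minimal sketch under that assumption:

```python
import numpy as np

def toggle_no_njit(a, b):
    # Assumed baseline: identical to toggle, but without Numba compilation.
    done, out = 0, []
    for i in range(len(a)):
        if done < a[i]:
            # The previous errand finished before this minute: start at a[i].
            done = a[i] + (b[i] * 2)
        else:
            # Still busy: queue this errand after the previous one.
            done = done + (b[i] * 2)
        out.append(done)
    return np.array(out)

minute = np.array([2, 2, 5, 5, 6, 7, 9, 11, 15])
errands = np.array([1, 1, 1, 7, 2, 2, 1, 1, 1])
print(toggle_no_njit(minute, errands))  # matches the expected "done" from the question
```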

This can be done with numpy alone:

def smart(m, e):
    e = 2*e
    r = np.empty(e.size+1, e.dtype)
    r[0] = 0
    e.cumsum(out=r[1:])
    return r[1:] + np.maximum.accumulate(m - r[:-1])
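Why this works, as far as I can tell: the loop computes done[i] = max(done[i-1], m[i]) + 2*e[i]. Write c[i] for the cumulative sum of 2*e (with c[-1] = 0) and subtract it from both sides; then u[i] = done[i] - c[i] satisfies u[i] = max(u[i-1], m[i] - c[i-1]), which is a running maximum and is what np.maximum.accumulate computes. One caveat: the loop starts from done = 0, while smart has no such floor, so the two agree only when m[0] >= 0 (true for the data in the question). The sketch below checks smart against a plain loop; loop_reference is a name introduced here for illustration.

```python
import numpy as np

def loop_reference(m, e):
    # The recurrence from the question, written with max():
    # done = max(done, m[i]) + 2 * e[i], starting from done = 0.
    done, out = 0, []
    for mi, ei in zip(m, e):
        done = max(done, mi) + 2 * ei
        out.append(done)
    return np.array(out)

def smart(m, e):
    # Copied from the answer above.
    e = 2 * e
    r = np.empty(e.size + 1, e.dtype)
    r[0] = 0
    e.cumsum(out=r[1:])          # r[1:] holds c[i], r[:-1] holds c[i-1]
    return r[1:] + np.maximum.accumulate(m - r[:-1])

minute = np.array([2, 2, 5, 5, 6, 7, 9, 11, 15])
errands = np.array([1, 1, 1, 7, 2, 2, 1, 1, 1])
print(np.array_equal(smart(minute, errands), loop_reference(minute, errands)))
```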

Test and timings: set up a random problem of size 1000:

>>> e = np.random.uniform(1, 3, 1000)
>>> m = np.random.uniform(1, 7, 1000).cumsum()

This gives the same result as the numba version:

>>> np.allclose(toggle(m, e), smart(m, e))
True

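But it is considerably faster, even with compile time excluded (timeit defaults to 1,000,000 calls, so the totals below, in seconds, can be read as microseconds per call):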

>>> timeit(lambda: toggle(m, e))
21.466296120896004
>>> timeit(lambda: smart(m, e))
11.608282678993419

You could use Numba to accomplish this task very efficiently.

But avoid lists wherever possible; @user3483203's answer builds its result in one. Lists come with a very high overhead, since Numba can't work on lists directly (see section 2.6.2.4.1, "List Reflection", in the Numba documentation).

Example

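import numpy as np
import numba as nb

@nb.njit
def toggle_2(a, b):
    done = 0.
    # Preallocate the output array instead of appending to a list.
    out = np.empty(a.shape[0], dtype=a.dtype)

    for i in range(a.shape[0]):
        if done < a[i]:
            done = a[i] + (b[i] * 2)
        else:
            done = done + (b[i] * 2)
        out[i] = done
    return out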

Performance

e = np.random.uniform(1, 3, 1_000)
m = np.random.uniform(1, 7, 1_000).cumsum()

Paul Panzer (smart):  13.22 µs
user3483203 (toggle): 18.47 µs
toggle_2:              2.47 µs

e = np.random.uniform(1, 3, 1_000_000)
m = np.random.uniform(1, 7, 1_000_000).cumsum()

Paul Panzer (smart):  15.97 ms
user3483203 (toggle): 30.28 ms
toggle_2:              3.77 ms
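The answer doesn't say how these figures were measured. One way to get comparable per-call numbers is sketched below, assuming the standard timeit module and a warm-up call so that Numba's compile time stays out of the measurement. make_problem and per_call_seconds are hypothetical helper names, not part of any answer.

```python
import timeit
import numpy as np

def make_problem(n, seed=0):
    # Same shape of test data as above: random durations, increasing start times.
    rng = np.random.default_rng(seed)
    e = rng.uniform(1, 3, n)
    m = rng.uniform(1, 7, n).cumsum()
    return m, e

def per_call_seconds(func, m, e, number=100, repeat=5):
    func(m, e)  # warm-up call: triggers any JIT compilation before timing
    timer = timeit.Timer(lambda: func(m, e))
    # Best of several runs, divided by the number of calls per run.
    return min(timer.repeat(repeat=repeat, number=number)) / number

m, e = make_problem(1_000)
# Substitute smart, toggle, or toggle_2 for the stand-in function below.
t = per_call_seconds(lambda m, e: m + 2 * e, m, e)
print(f"{t * 1e6:.2f} µs per call")
```

Absolute numbers will differ by machine and library version; only the ratios between the functions are worth comparing.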
