This question is quite close to my heart as I have been doing something like this for almost 2 years and always wondered if there is a vectorized way of modifying large array/dataframe when row i
depends upon row i-1
, ie, when recursion sounds mandatory. I am very keen to hear if there are either clever algorithms or clever tools (cython, numba, get rid of redundant operations, etc.) to optimize the runtime.
Problem :
I have 3 big numpy arrays: x
, y
and z
of shape (1million by 500), (1million by 500) and (1million by 1). Clip/winsorize each element in any given row i
of x
based on whether |(x[i] - x[i-1]) * z[i] / y[i]| > thresh
|(x[i] - x[i-1]) * z[i] / y[i]| > thresh
. I am doing this in the following way which is taking extremely long time for my simulations to run (esp when this step repeats thousands of time to tune the hyperparameters):
t = x.copy()
t[0] = np.clip(t[0] * z[0]/ y[0], -1 * thresh, thresh) * y[0] / z[0]
for i in range(1, t.shape[0]):
t[i] = np.clip((t[i] - t[i-1]) * z[i] / y[i], -1* thresh, thresh) * y[i] / z[i] + t[i-1]
Sample input :
import numpy as np
import random
x = np.random.rand(1000000, 500)
y = np.random.rand(1000000, 500)
z = np.random.rand(1000000, 1)
thresh = 0.7
Edit : Modified to remove append as suggested by @Mad Physicist and redundant if-else as suggested by @Pedro Maia
I got 3-4x improvement by using numba
, another 2x (in total 6-8x) by caching compiled function.
I had to decrease size of x
, y
, and z
due to small RAM on my PCU.
import numba as nb
import numpy as np
@nb.jit(nopython=True, cache=True)
def bottleneck_func3(t, thresh, y, z):
t[0] = np.clip(t[0] * z[0] / y[0], -1 * thresh, thresh) * y[0] / z[0]
for i in range(1, t.shape[0]):
t[i] = np.clip((t[i] - t[i - 1]) * z[i] / y[i], -1 * thresh, thresh) * y[i] / z[i] + t[i - 1]
x = np.random.rand(300000, 500)
y = np.random.rand(300000, 500)
z = np.random.rand(300000, 1)
thresh = 0.7
t = x.copy()
bottleneck_func3(t, thresh, y, z)
Note that cached function is recompiled each time you modify the file , even parts that have nothing to do with cached function.
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.