Vectorizing image thresholding with Python/NumPy

Question

I've been trying to find a more efficient way to iterate through an image and split its pixels on a threshold. While searching online and talking with some programmer friends, I was introduced to the concept of vectorizing a function (particularly using NumPy). After much searching and trial and error, I can't seem to get the hang of it. Can someone give me a link, or a suggestion for how to make the following code more efficient?

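import numpy as np
import matplotlib.pyplot as plt

Im = plt.imread(img)  # img: path to the image file
Imarray = np.array(Im)
# Running totals must be initialized before the loop.
dim_sum = dim_counter = bright_sum = bright_counter = 0
for line in Imarray:
    for pixel in line:
        if pixel <= 20000:
            dim_sum += int(pixel)  # int() avoids overflow with small integer dtypes
            dim_counter += 1
        else:
            bright_sum += int(pixel)
            bright_counter += 1
bright_mean = bright_sum / bright_counter
dim_mean = dim_sum / dim_counter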

Basically, each pixel holds a brightness value between 0 and 30000, and I'm trying to average the pixels at or below 20000 and the pixels above 20000 separately. The best way I know to do this is with for loops (which are slow in Python), checking each pixel with if statements.

Answer 1

NumPy supports and encourages vectorization through its arrays and ufuncs. In your case, the input image is already a NumPy array, so each comparison can be done in one vectorized step, giving a boolean array of the same shape as the input. Using that boolean array to index the input array selects the elements where it is True. This is called boolean indexing, and it is the key feature behind this kind of vectorized selection.
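
As a small illustration of boolean indexing, here is a sketch on a made-up 2x3 array. The values and the names sample, bright and dim are assumptions for the example, not taken from the question.

```python
import numpy as np

# Hypothetical 2x3 "image"; the values are made up for illustration.
sample = np.array([[100, 25000, 20000],
                   [30000, 5, 21000]])

# The elementwise comparison gives a boolean array of the same shape.
mask = sample > 20000

# Indexing with the mask returns a 1-D array of the selected elements,
# in row-major order; ~mask selects the complementary elements.
bright = sample[mask]
dim = sample[~mask]
```

Note that the selection flattens the result: the 2-D layout is lost, which is fine here because only the mean is needed.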

Finally, we use the ndarray.mean method, which also operates in a vectorized fashion, to get the mean of the selected elements.

Putting all of that into code, we would have -

bright_mean, dim_mean = Im[Im > 20000].mean(), Im[Im <= 20000].mean()

For this particular problem, it is more efficient to perform the comparison only once. The comparison gives a boolean array that can be used twice: once as it is and once inverted. Thus, alternatively, we would have -

mask = Im > 20000
bright_mean, dim_mean = Im[mask].mean(), Im[~mask].mean()


solution1
6 ACCEPTED 2016-07-08 17:35:44