What is the fastest way to threshold a numpy array?

Question

I want the resulting array as a binary yes/no.

I came up with:

    img = PIL.Image.open(filename)

    array = numpy.array(img)
    thresholded_array = numpy.copy(array)

    brightest = numpy.amax(array)
    threshold = brightest/2

    for b in xrange(490):
        for c in xrange(490):
            if array[b][c] > threshold:
                thresholded_array[b][c] = 255
            else:
                thresholded_array[b][c] = 0

    out=PIL.Image.fromarray(thresholded_array)

but iterating over the array one value at a time is very slow, and I know there must be a faster way. What's the fastest?

Answer 1

Instead of looping, you can compare the entire array at once in several ways. Starting from:

>>> import numpy as np
>>> arr = np.random.randint(0, 255, (3,3))
>>> brightest = arr.max()
>>> threshold = brightest // 2
>>> arr
array([[214, 151, 216],
       [206,  10, 162],
       [176,  99, 229]])
>>> brightest
229
>>> threshold
114

Method #1: use np.where:

>>> np.where(arr > threshold, 255, 0)
array([[255, 255, 255],
       [255,   0, 255],
       [255,   0, 255]])

Method #2: use boolean indexing to create a new array:

>>> up = arr > threshold
>>> new_arr = np.zeros_like(arr)
>>> new_arr[up] = 255

Method #3: do the same, but use an arithmetic hack:

>>> (arr > threshold) * 255
array([[255, 255, 255],
       [255,   0, 255],
       [255,   0, 255]])

which works because False == 0 and True == 1.
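As a quick sanity check, the three methods can be compared on the 3x3 example values shown above. This is only a sketch; the names `via_where`, `via_index`, and `via_multiply` are made up for illustration.

```python
import numpy as np

# Same values as the 3x3 example above.
arr = np.array([[214, 151, 216],
                [206,  10, 162],
                [176,  99, 229]])
threshold = arr.max() // 2  # 229 // 2 == 114

# Boolean mask: True where the pixel is above the threshold.
mask = arr > threshold

via_where = np.where(mask, 255, 0)   # Method #1
via_index = np.zeros_like(arr)       # Method #2
via_index[mask] = 255
via_multiply = mask * 255            # Method #3: True -> 255, False -> 0

# All three approaches should give the same array.
assert np.array_equal(via_where, via_index)
assert np.array_equal(via_where, via_multiply)
print(via_multiply)
```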

For a 1000x1000 array, it looks like the arithmetic hack is fastest for me, but to be honest I'd use np.where because I think it's clearest:

>>> %timeit np.where(arr > threshold, 255, 0)
100 loops, best of 3: 12.3 ms per loop
>>> %timeit up = arr > threshold; new_arr = np.zeros_like(arr); new_arr[up] = 255;
100 loops, best of 3: 14.2 ms per loop
>>> %timeit (arr > threshold) * 255
100 loops, best of 3: 6.05 ms per loop
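Outside IPython, where `%timeit` is not available, a similar comparison could be run with the standard `timeit` module. The helper below is a rough sketch: `time_threshold_methods` and its parameters are hypothetical names, and the numbers will differ by machine and NumPy version, so the ranking above may not hold everywhere.

```python
import timeit

import numpy as np


def time_threshold_methods(size=1000, repeats=10, seed=0):
    """Return the average seconds per call for each thresholding method."""
    rng = np.random.default_rng(seed)
    arr = rng.integers(0, 256, size=(size, size))
    threshold = arr.max() // 2

    def use_where():
        return np.where(arr > threshold, 255, 0)

    def use_mask():
        new_arr = np.zeros_like(arr)
        new_arr[arr > threshold] = 255
        return new_arr

    def use_multiply():
        return (arr > threshold) * 255

    methods = {"where": use_where, "mask": use_mask, "multiply": use_multiply}
    # timeit.timeit returns the total time for `number` calls, so divide.
    return {
        name: timeit.timeit(fn, number=repeats) / repeats
        for name, fn in methods.items()
    }


timings = time_threshold_methods()
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e3:.2f} ms per call")
```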

Answer 2

I'm not sure if your thresholding operation is special, e.g. whether you need to customize it for every pixel, but you can just use a logical operation on NumPy arrays. For example:

import numpy as np


a = np.round(np.random.rand(5,5)*255)

thresholded_array = a > 100  # threshold at the value 100

print(a)
print(thresholded_array)

Gives:

[[ 238.  201.  165.  111.  127.]
 [ 188.   55.  157.  121.  129.]
 [ 220.  127.  231.   75.   23.]
 [  76.   67.   75.  141.   96.]
 [ 228.   94.  172.   26.  195.]]

[[ True  True  True  True  True]
 [ True False  True  True  True]
 [ True  True  True False False]
 [False False False  True False]
 [ True False  True False  True]]
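The comparison returns a boolean array, while the question asks for 0/255 pixel values. One possible way to bridge the two, assuming an 8-bit grayscale image is wanted, is to cast the mask to `uint8` and scale it. The small array here reuses a few values from the output above, and `binary` is an illustrative name.

```python
import numpy as np

a = np.array([[238., 201., 165.],
              [188.,  55., 157.],
              [ 76.,  67.,  75.]])

mask = a > 100                        # boolean array, as in the answer
binary = mask.astype(np.uint8) * 255  # True -> 255, False -> 0

print(binary.dtype)
print(binary)
```

An array like `binary` could then be handed to `PIL.Image.fromarray`, as in the question.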
