
Vectorizing image thresholding with Python/NumPy

I've been trying to find a more efficient way to iterate through an image and split its pixels at a threshold. While searching online and talking with some programmer friends, I was introduced to the concept of vectorizing a function (particularly with NumPy). After much searching and trial and error, I can't seem to get the hang of it. Can someone give me a link or a suggestion on how to make the following code more efficient?

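import numpy as np
import matplotlib.pyplot as plt

Im = plt.imread(img)  # img is the path to the image file
Imarray = np.array(Im)
# Float sums avoid overflow if the pixels are a small integer type
dim_sum = bright_sum = 0.0
dim_counter = bright_counter = 0
for line in Imarray:
    for pixel in line:
        if pixel <= 20000:
            dim_sum += pixel
            dim_counter += 1
        else:  # pixel > 20000
            bright_sum += pixel
            bright_counter += 1
bright_mean = bright_sum / bright_counter
dim_mean = dim_sum / dim_counter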

Basically, each pixel holds a brightness value between 0 and 30000, and I'm trying to compute two averages: one over all pixels at or below 20000 and one over all pixels above 20000. The best way I know to do this is with for loops (which are slow in Python), checking each pixel with if statements.

NumPy supports and encourages vectorization through its arrays and ufuncs. In your case, the input image is already a NumPy array, so the comparisons can be done in one go, in a vectorized manner, giving boolean arrays of the same shape as the input array. When used to index the input array, those boolean arrays select the elements that satisfy the condition. This is called boolean indexing, and it is a key feature of this kind of vectorized selection.
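As a small illustration of boolean indexing, here is a minimal sketch. The 2x3 array `img` and its values are made up for the example; only the 20000 threshold comes from the question.

```python
import numpy as np

# Hypothetical 2x3 "image"; the values are invented for illustration.
img = np.array([[100, 25000, 18000],
                [30000, 20000, 21000]], dtype=np.uint16)

mask = img > 20000   # boolean array with the same shape as img
bright = img[mask]   # 1-D array of the elements where mask is True
dim = img[~mask]     # elements where mask is False, i.e. <= 20000

print(mask)
print(bright, dim)
```

Note that indexing with a boolean mask returns a flattened 1-D array of the selected values, which is all that is needed for computing a mean.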

Finally, we use the NumPy method ndarray.mean, which again operates in a vectorized fashion, to get the mean values of the selected elements.

Putting all of that into code, we would have:

bright_mean, dim_mean = Im[Im > 20000].mean(), Im[Im <= 20000].mean()

For this particular problem, from an efficiency point of view, it makes more sense to perform the comparison only once. The comparison gives a boolean array that can be used twice: once as it is and once inverted. So, alternatively, we would have:

mask = Im > 20000
bright_mean, dim_mean = Im[mask].mean(), Im[~mask].mean()
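To check that the mask version agrees with the original loop, here is a rough sketch on synthetic data. The random image, its 64x64 size, the seed, and the helper names `loop_means` and `masked_means` are assumptions made for the example, not part of the question.

```python
import numpy as np

THRESHOLD = 20000  # threshold taken from the question


def loop_means(image, threshold=THRESHOLD):
    """Pixel-by-pixel reference that mirrors the question's loop."""
    bright_sum = dim_sum = 0.0  # float sums avoid integer overflow
    bright_count = dim_count = 0
    for row in image:
        for pixel in row:
            if pixel > threshold:
                bright_sum += pixel
                bright_count += 1
            else:
                dim_sum += pixel
                dim_count += 1
    return bright_sum / bright_count, dim_sum / dim_count


def masked_means(image, threshold=THRESHOLD):
    """Vectorized version: one comparison, mask reused for both groups."""
    mask = image > threshold
    return image[mask].mean(), image[~mask].mean()


# Synthetic stand-in for the real image: integers in [0, 30000].
rng = np.random.default_rng(0)
image = rng.integers(0, 30001, size=(64, 64)).astype(np.uint16)

expected = loop_means(image)
actual = masked_means(image)
print(np.allclose(expected, actual))
```

One caveat for either version: if no pixel falls on one side of the threshold, that group is empty. The loop then divides by zero, while `mean()` on an empty selection returns `nan` with a RuntimeWarning, so real code may want to check `mask.any()` and `(~mask).any()` first.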
