Scipy ndimage morphology operators saturate my computer memory RAM (8GB)

I need to compute the morphological opening of a 3D array of shape (400, 401, 401) and size 64,320,400 bytes, using a 3D structuring element with a radius of 17 or greater. The structuring-element ndarray is 42,875 bytes. Using scipy.ndimage.morphology.binary_opening, the whole process consumes 8GB of RAM.

I have read scipy/ndimage/morphology.py on GitHub, and as far as I can tell, the morphological erosion operator is implemented in pure C. The ni_morphology.c source is too difficult for me to follow, so I haven't found the part of the code that leads to such enormous memory utilization. Adding more RAM is not a workable solution, since memory usage grows steeply with the structuring-element radius.

To reproduce the problem:

import numpy as np
from scipy import ndimage

arr_3D = np.ones((400,401,401),dtype="bool")

# connectivity-1 cross, iterated out to a radius-20 structuring element
str_3D = ndimage.morphology.generate_binary_structure(3,1)
big_str_3D = ndimage.morphology.iterate_structure(str_3D,20)

arr_out_3D = ndimage.morphology.binary_opening(arr_3D, big_str_3D)

This takes approximately 7GB of RAM.

Does anyone have suggestions for how to compute the morphological opening in the example described above?

I too do openings of increasing radius for granulometry, and I ran into this same problem. In fact, memory usage increases roughly as R^6, where R is the radius of the spherical kernel. That's quite a rate of increase! I did some memory profiling, including splitting the opening into an erosion followed by a dilation (the definition of opening), and found that the large memory usage comes from SciPy's binaries and is cleared as soon as the result is returned to the calling Python script. SciPy's morphology code is mostly implemented in C, so modifying it is a difficult prospect.

Anyway, the OP's last comment - "After some research I turned to an opening implementation using convolution -> multiplication of Fourier transforms - O(n log n), and not so big a memory overhead" - helped me figure out the solution, so thanks for that. The implementation, however, was not obvious at first. For anyone else who happens upon this problem, I am going to post it here.

I will start with dilation, because binary erosion is just the dilation of the complement (inverse) of a binary image, with the result then inverted.

In short: according to the white paper by Kosheleva et al. (reference 1 below), dilation can be viewed as a convolution of the dataset A with the structuring element (spherical kernel) B, thresholded above a certain value. Convolutions can also be done (often much faster) in frequency space, since a multiplication in frequency space is the same as a convolution in real space. So by taking the Fourier transforms of A and B, multiplying them, inverse-transforming the result, and then thresholding for values above 0.5, you get the dilation of A by B. (Note that the white paper I linked says to threshold above 0, but much testing showed that this gave wrong results with many artifacts; another white paper, by Kukal et al. (reference 2), gives the threshold value as >0.5, and that gave results identical to scipy.ndimage.binary_dilation for me. I'm not sure why the discrepancy, and I wonder if I missed some detail of reference 1's nomenclature.)

Proper implementation of that involves padding for size, but luckily for us it's already been done in scipy.signal.fftconvolve(A,B,'same') - this function does what I just described and takes care of the padding for you. Passing 'same' as the third argument returns a result the same size as A, which is what we want (otherwise the result would be padded out by the size of B).

So dilation is:

from scipy.signal import fftconvolve
def dilate(A,B):
    # convolve, then threshold: values above 0.5 are reached by the kernel
    return fftconvolve(A,B,'same')>0.5
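
A quick way to sanity-check this against ndimage, on a volume small enough that both methods run comfortably (the array and kernel sizes here are just illustrative):

import numpy as np
from scipy import ndimage

A = np.random.rand(40, 41, 41) > 0.7   # random binary test volume
B = ndimage.iterate_structure(ndimage.generate_binary_structure(3, 1), 5)

# the FFT-based dilation should reproduce the exact ndimage result
assert np.array_equal(dilate(A, B), ndimage.binary_dilation(A, B))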

Erosion, in principle, is this: invert A, dilate it by B as above, and then re-invert the result. But a slight trick is needed to exactly match the results of scipy.ndimage.binary_erosion - you must pad the inversion with 1s out to at least the radius R of the spherical kernel B. Erosion can thus be implemented as follows to get results identical to scipy.ndimage.binary_erosion. (Note that the code could be written in fewer lines, but I'm trying to be illustrative here.)

from scipy.signal import fftconvolve
import numpy as np
def erode_v1(A,B,R):
    #R should be the radius of the spherical kernel, i.e. half the width of B
    A_inv = np.logical_not(A)
    A_inv = np.pad(A_inv, R, 'constant', constant_values=1)
    tmp = fftconvolve(A_inv, B, 'same') > 0.5
    #now we must un-pad the result, and invert it again
    return np.logical_not(tmp[R:-R, R:-R, R:-R])

You can get identical erosion results another way, as shown in the white paper by Kukal et al. - they point out that the convolution of A and B can be turned into an erosion by thresholding at > m-0.5, where m is the "size" of B (which turns out to be the volume of the sphere, not the volume of the array). I showed erode_v1 first because it's slightly easier to understand, but the results are the same here:

from scipy.signal import fftconvolve
import numpy as np
def erode_v2(A,B):
    # m = np.count_nonzero(B) is the "size" (volume) of the kernel
    thresh = np.count_nonzero(B)-0.5
    return fftconvolve(A,B,'same') > thresh
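
Putting the pieces together for the original question: an opening is just an erosion followed by a dilation with the same kernel. A minimal sketch combining the erode_v2 and dilate forms above (fft_opening is my own name for it, not a library function):

from scipy.signal import fftconvolve
import numpy as np

def fft_opening(A, B):
    # opening = erosion followed by dilation, both via FFT convolution
    eroded = fftconvolve(A, B, 'same') > np.count_nonzero(B) - 0.5
    return fftconvolve(eroded, B, 'same') > 0.5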

I hope this helps anyone else having this problem. Notes about the results I got:

  • I tested this in both 2D and 3D, and all results were identical to the answers from the scipy.ndimage morphological operations (as well as the skimage operations, which on the back end just call the ndimage ones).
  • For my largest kernels (R = 21), memory usage was 30x lower! It was also about 20x faster.
  • I only tested it on binary images, though - I just don't know about greyscale, but there is some discussion of that in the second reference below.

Two more quick notes:

First: consider the padding I discussed in the section about erode_v1. Padding the inverse out with 1s basically allows erosion to occur from the edges of the dataset as well as from any interface within the dataset. Depending on your system and what you are trying to do, you may want to consider whether this truly represents the way you want edges handled. If not, you might consider padding with the 'reflect' boundary condition, which simulates a continuation of any features near the edge (see the sketch below). I recommend playing around with different boundary conditions (on both dilation and erosion) and visualizing and quantifying the results to determine what best suits your system and goals.
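
As an illustration, a hypothetical variant of erode_v1 that pads with 'reflect' instead of constant 1s (erode_reflect is my own name; whether this boundary treatment is appropriate depends on your data):

from scipy.signal import fftconvolve
import numpy as np

def erode_reflect(A, B, R):
    # mirror features across the border instead of eroding in from the edges
    A_inv = np.pad(np.logical_not(A), R, 'reflect')
    tmp = fftconvolve(A_inv, B, 'same') > 0.5
    return np.logical_not(tmp[R:-R, R:-R, R:-R])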

Second: this frequency-based method is better not only in memory but also in speed - for the most part. For small kernels B, the original method is faster. However, small kernels run very quickly anyway, so for my own purposes I don't care. If you do (for example, if you apply a small kernel many times), you may want to find the critical size of B and switch methods at that point.
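
A rough sketch of how one might locate that crossover empirically, reusing the dilate function above (the test volume and radii are arbitrary):

import timeit
import numpy as np
from scipy import ndimage

A = np.random.rand(100, 100, 100) > 0.5   # small binary test volume
for R in (1, 2, 5, 10, 15):
    B = ndimage.iterate_structure(ndimage.generate_binary_structure(3, 1), R)
    t_fft = timeit.timeit(lambda: dilate(A, B), number=3)
    t_nd = timeit.timeit(lambda: ndimage.binary_dilation(A, B), number=3)
    print(f"R={R}: fft {t_fft:.3f}s  ndimage {t_nd:.3f}s")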

References (I apologize that they are not easy to cite, as neither provides a year):

  1. Fast Implementation of Morphological Operations Using Fast Fourier Transform, by O. Kosheleva, S. D. Cabrera, G. A. Gibson, and M. Koshelev. http://www.cs.utep.edu/vladik/misha5.pdf
  2. Dilation and Erosion of Gray Images with Spherical Masks, by J. Kukal, D. Majerova, and A. Prochazka. http://www2.humusoft.cz/www/papers/tcp07/001_kukal.pdf

A wild guess would be that the code is trying to decompose the structuring element somehow and doing several parallel computations, each with its own copy of the whole original data. 400x400x400 is not that big, tbh...

AFAIK, since you are doing a single opening/closing, it should use at most 3x the memory of the original data: original + dilation/erosion + final result...

You could try to implement it yourself by hand... it might be slower, but the code is easy enough and should give some insight into the problem (a sketch of one such approach follows)...
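
For instance, a minimal hand-rolled binary erosion, assuming B is a (2R+1)^3 bool array centered at its midpoint (a sketch only; erosion_by_shifts is not a library function). It ANDs together one shifted view of the volume per kernel voxel, so memory stays at roughly two copies of A no matter how large the kernel radius gets:

import numpy as np

def erosion_by_shifts(A, B):
    # a voxel survives only if every true voxel of B, centered there,
    # lands on foreground; the border is treated as background
    R = B.shape[0] // 2
    padded = np.pad(A, R, constant_values=False)
    out = np.ones_like(A)
    for dz, dy, dx in np.argwhere(B) - R:   # offsets of B's true voxels
        out &= padded[R + dz : R + dz + A.shape[0],
                      R + dy : R + dy + A.shape[1],
                      R + dx : R + dx + A.shape[2]]
    return out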
