
What are the downsides of convolution by FFT compared to real-space convolution?

I am aware that convolution by FFT has a lower computational complexity than convolution in real space. But what are the downsides of an FFT convolution?

Does the kernel size always have to match the image size, or are there functions that take care of this, for example in Python's numpy and scipy packages? And what about anti-aliasing effects?

FFT convolutions are based on the convolution theorem, which states that, given two functions f and g, if Fd() and Fi() denote the direct and inverse Fourier transforms, and * and . denote convolution and multiplication, then:

f*g = Fi(Fd(f).Fd(g))

To apply this to a signal f and a kernel g, there are some things you need to take care of:

  • f and g have to be of the same size for the multiplication step to be possible, so you need to zero-pad the kernel (or the input, if the kernel is longer than it).
  • The DFT, which is what an FFT computes, treats its input as one period of a periodic signal, so multiplying two DFTs gives a circular convolution. This means that, by default, your kernel wraps around the edges of the signal. If you want this, then all is great. If not, you have to add extra zero-padding the size of the kernel (so that the padded length is at least the signal length plus the kernel length minus 1) to avoid it.
  • Most (all?) FFT packages only perform well with sizes that do not have any large prime factors. Rounding the signal and kernel size up to the next power of two is a common practice that may result in a (very) significant speed-up.
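
The three points above can be put together in a few lines of NumPy. This is only a minimal sketch for 1-D real signals; the function names `fft_convolve` and `circular_convolve` are made up for illustration, and the power-of-two rounding is just one reasonable choice of FFT size.

```python
import numpy as np

def fft_convolve(f, g):
    """Linear convolution of two 1-D real arrays via the FFT (illustrative helper)."""
    n_out = len(f) + len(g) - 1             # length of the full linear convolution
    n_fft = 1 << (n_out - 1).bit_length()   # next power of two >= n_out
    # Passing n_fft makes rfft zero-pad both inputs to the same size.
    F = np.fft.rfft(f, n_fft)
    G = np.fft.rfft(g, n_fft)
    y = np.fft.irfft(F * G, n_fft)
    return y[:n_out]                        # drop the samples that are only padding

def circular_convolve(f, g):
    """Same spectral product, but padded only to len(f): the kernel wraps around."""
    n = len(f)
    return np.fft.irfft(np.fft.rfft(f, n) * np.fft.rfft(g, n), n)

rng = np.random.default_rng(0)
f = rng.standard_normal(1000)   # signal
g = rng.standard_normal(31)     # kernel

linear = fft_convolve(f, g)        # agrees with np.convolve(f, g) up to rounding
wrapped = circular_convolve(f, g)  # first len(g) - 1 samples are contaminated
```

Without the extra padding, the tail of the result that does not fit into `len(f)` samples is folded back onto the first `len(g) - 1` output samples, which is the wrap-around described above.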

If your signal and kernel sizes are f_l and g_l, doing a straightforward convolution in the time domain requires g_l * (f_l - g_l + 1) multiplications and (g_l - 1) * (f_l - g_l + 1) additions (counting only the f_l - g_l + 1 output samples where the kernel fully overlaps the signal).

For the FFT approach, you have to do 3 FFTs (two forward, one inverse) of size at least f_l + g_l, as well as f_l + g_l multiplications.
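
To get a feel for these counts, they can be tabulated with a small script. This is a rough sketch, not a benchmark: `direct_ops` and `fft_ops_estimate` are hypothetical helpers, the FFT size is rounded up to a power of two, and the cost of one FFT is assumed to be n*log2(n) "operations" with the constant factor ignored.

```python
import math

def direct_ops(f_l, g_l):
    """(multiplications, additions) for the fully overlapping part of a direct convolution."""
    n_out = f_l - g_l + 1
    return g_l * n_out, (g_l - 1) * n_out

def fft_ops_estimate(f_l, g_l):
    """Crude estimate for the FFT route: 3 FFTs of ~n*log2(n) each plus n pointwise products.

    Assumption: n is f_l + g_l rounded up to a power of two, and constant
    factors of the FFT implementation are ignored.
    """
    n = 1 << (f_l + g_l - 1).bit_length()
    return 3 * n * math.log2(n) + n

for g_l in (31, 500):
    mults, adds = direct_ops(1000, g_l)
    print(g_l, mults + adds, round(fft_ops_estimate(1000, g_l)))
```

Under these assumptions, a 31-tap kernel on a 1000-sample signal comes out cheaper with the direct form, while a 500-tap kernel comes out cheaper with the FFT. Real crossover points depend on the implementation and hardware, so the numbers are only indicative.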

For large sizes of both f and g, the FFT is clearly superior with its n*log(n) complexity. For small kernels, the direct approach may be faster.

scipy.signal has both convolve and fftconvolve functions for you to play around with. fftconvolve handles all the padding described above transparently for you.
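
As a quick illustration that the kernel does not have to match the image size, here is a small example with a 256x256 array and a 15x15 kernel. The array sizes and random data are arbitrary choices for the sketch; `scipy.signal.choose_conv_method` is SciPy's own heuristic for picking between the two methods, and its answer may vary with sizes and SciPy version.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
image = rng.standard_normal((256, 256))   # arbitrary test "image"
kernel = rng.standard_normal((15, 15))    # much smaller kernel, no manual padding

# Direct (real-space) convolution, output cropped to the image size.
direct = signal.convolve(image, kernel, mode="same", method="direct")

# FFT-based convolution; the zero-padding happens inside the function.
via_fft = signal.fftconvolve(image, kernel, mode="same")

print(np.allclose(direct, via_fft))       # the two should agree up to rounding

# SciPy's estimate of which method is faster for these inputs: "fft" or "direct".
suggested = signal.choose_conv_method(image, kernel, mode="same")
print(suggested)
```

Calling `signal.convolve` with its default `method="auto"` makes the same choice internally, so in practice you rarely need to pick by hand.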

While fast convolution has better "big O" complexity than direct form convolution, there are a few drawbacks or caveats. I did some thinking about this topic for an article I wrote a while back.

  1. Better "big O" complexity is not always better. Direct form convolution can be faster than using FFTs for filters smaller than a certain size. The exact size depends on the platform and implementations used. The crossover point is usually in the 10-40 coefficient range.

  2. Latency. Fast convolution is inherently a blockwise algorithm. Queueing up hundreds or thousands of samples at a time before transforming them may be unacceptable for some real-time applications.

  3. Implementation complexity. Direct form is simpler in terms of memory, code space, and the theoretical background required of the writer/maintainer.

  4. On a fixed-point DSP platform (not a general-purpose CPU): the limited word size of fixed-point FFTs makes large fixed-point FFTs nearly useless. At the other end of the size spectrum, these chips have specialized MAC instructions that are well designed for performing direct form FIR computation, increasing the range over which the O(N^2) direct form is faster than O(NlogN). These factors tend to create a limited "sweet spot" where fixed-point FFTs are useful for fast convolution.
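
The blockwise nature mentioned in point 2 is easiest to see in an overlap-add filter, one common way to apply fast convolution to a long or streaming signal. The sketch below is a minimal NumPy version under assumed parameters: `overlap_add` and `block_len` are illustrative names, and 256 is an arbitrary block size. SciPy ships its own overlap-add routine as `scipy.signal.oaconvolve`.

```python
import numpy as np

def overlap_add(x, h, block_len=256):
    """Filter x with FIR kernel h one block at a time (illustrative overlap-add sketch)."""
    # FFT size: next power of two >= block_len + len(h) - 1, so no wrap-around.
    n_fft = 1 << (block_len + len(h) - 2).bit_length()
    H = np.fft.rfft(h, n_fft)                 # kernel spectrum, computed once
    y = np.zeros(len(x) + len(h) - 1)
    for start in range(0, len(x), block_len):
        # A whole block must be available before it can be transformed:
        # this is where the latency comes from.
        block = x[start:start + block_len]
        seg = np.fft.irfft(np.fft.rfft(block, n_fft) * H, n_fft)
        n_seg = len(block) + len(h) - 1
        y[start:start + n_seg] += seg[:n_seg]  # the tail overlaps the next block
    return y

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
h = rng.standard_normal(31)
y = overlap_add(x, h)   # agrees with np.convolve(x, h) up to rounding
```

A smaller `block_len` lowers the latency but makes each FFT less efficient relative to the kernel length, which is the trade-off a real-time design has to balance.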
