
Is it possible to overlap batched FFTs with CUDA's cuFFT library and cufftPlanMany?

I am trying to parallelize the FFT transforms of an acoustic fingerprinting library known as Chromaprint. It works by "splitting the original audio into many overlapping frames and applying the Fourier transform on them." Chromaprint uses a frame size of 4096 with a 2/3 overlap. For instance, the first frame consists of elements [0...4095], and the second frame is something like [1366...5461].

With cufftPlanMany, I know that you can specify a transform size of 4096 and it will process batches covering [0...4095], [4096...8191], etc. Is there some way to make the batched transforms overlap, or should I consider another approach that doesn't use batched execution?

If you use Advanced Data Layout, the idist parameter should allow you to set an arbitrary offset between the starting points of two successive transform input sets, so an idist smaller than the transform length should make successive frames overlap.

For the 1D case, each input element is selected from the parameters you pass as follows:

input[ b * idist + x * istride]

(where b is the batch currently being processed, i.e. b = 0, 1, 2, ..., batch - 1, and x is the element index within that transform, i.e. x = 0, 1, ..., n - 1)

"The idist and odist parameters indicate the distance between the first element of two consecutive batches in the input and output data."
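To make the indexing formula concrete, here is a minimal CPU-only sketch in C++ that evaluates input[b * idist + x * istride] for overlapping frames. It does not call cuFFT. The helper names (input_index, frame_count, frame_spans) are hypothetical, and the hop of 1365 samples is an assumption (4096 / 3 rounded down), not a value taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of element x of batch b under the 1D advanced data layout formula:
// input[b * idist + x * istride].
std::size_t input_index(std::size_t b, std::size_t x,
                        std::size_t idist, std::size_t istride) {
    return b * idist + x * istride;
}

// Number of complete frames of length n that fit in `total` contiguous
// samples (istride == 1) when consecutive frames start idist samples apart.
std::size_t frame_count(std::size_t total, std::size_t n, std::size_t idist) {
    assert(idist > 0);
    if (total < n) return 0;
    return (total - n) / idist + 1;
}

// First and last input index read by one frame.
struct FrameSpan {
    std::size_t first;
    std::size_t last;
};

// Spans of every complete frame, assuming istride == 1.
std::vector<FrameSpan> frame_spans(std::size_t total, std::size_t n,
                                   std::size_t idist) {
    std::vector<FrameSpan> spans;
    const std::size_t batch = frame_count(total, n, idist);
    for (std::size_t b = 0; b < batch; ++b) {
        spans.push_back({input_index(b, 0, idist, 1),
                         input_index(b, n - 1, idist, 1)});
    }
    return spans;
}
```

With n = 4096 and idist = 1365, frame_spans(12288, 4096, 1365) describes seven frames whose starts are 1365 samples apart, so each frame shares roughly two thirds of its samples with the next one. In a real plan, the same idist and the resulting frame count would be the idist and batch arguments of cufftPlanMany. Note that cuFFT ignores the advanced layout parameters when inembed or onembed is NULL, and an out-of-place transform is the safer assumption, since an in-place one would write results over samples that a later frame still needs to read.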
