STFT / sliding FFT on real-time data

Question

I recently picked up a project in which I need to perform a real-time sliding FFT analysis on incoming microphone data. The environment I chose for this is OpenGL and Cinder, using C++.

This is my first experience with audio programming and I am a little confused.

This is what I am trying to achieve in my OpenGL application:

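[Figure: FFT windows advancing by hop-size across consecutive frames of incoming data; the blue area marks the span between two frames.]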

In every frame, a chunk of the incoming data is available. In a for-loop (therefore in multiple passes), a window of the present data is consumed and an FFT is performed on it. On the next iteration of the for-loop, the window advances by "hop-size" through the data, and so on until the end of the data is reached.

Now this process must be contiguous. But as you can see in the figure above, as soon as my current app frame ends and the next frame's data comes in, I can't pick up where I left off in the previous frame (because that data is already gone). You can see this in the figure, where the blue area lies between two frames.

Now you may say: pick the window-size / hop-size so that this never happens. But that is impossible, since these parameters should be left user-configurable in my project.

Suggestions for this kind of processing, oriented towards C++11, are also very welcome!

Thanks!

Answer 1

Not sure I understand your scenario 100%, but it sounds like you may want to use a circular buffer. There is no "standard" circular buffer, but there's one in Boost.

However, you'd need a lock if you plan to do the processing with 2 threads. One thread, for example, would wait on the audio input, then take the buffer lock and copy from the audio buffer to the circular buffer. The second thread would periodically take the buffer lock and read the next k elements, if there are at least k available in the buffer.

You'd need to size the buffer appropriately and make sure you always consume the data faster than it arrives, to avoid losing samples in the circular buffer.
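To make the two-thread idea concrete, here is a minimal sketch of a mutex-guarded ring buffer in C++11. The class name LockedRingBuffer and its methods write() and readWindow() are made up for illustration (this is not Boost's API), and the policy of dropping samples that don't fit is just one possible choice. The reader advances by hop rather than by the full window, so the overlapping samples stay in the buffer for the next read:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-capacity ring buffer guarded by a mutex.
// The audio thread calls write(); the analysis thread calls readWindow().
class LockedRingBuffer {
public:
    explicit LockedRingBuffer(std::size_t capacity)
        : data_(capacity), head_(0), count_(0) {
        assert(capacity > 0);
    }

    // Producer side: copy up to n samples in. Returns how many were stored;
    // samples that do not fit are dropped instead of overwriting unread data.
    std::size_t write(const float* samples, std::size_t n) {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t stored = 0;
        while (stored < n && count_ < data_.size()) {
            data_[(head_ + count_) % data_.size()] = samples[stored];
            ++count_;
            ++stored;
        }
        return stored;
    }

    // Consumer side: if at least `window` samples are available, copy them
    // into `out` and advance the read position by `hop`, so the remaining
    // window - hop samples are still there for the next call.
    bool readWindow(std::vector<float>& out, std::size_t window, std::size_t hop) {
        assert(hop > 0 && hop <= window);
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ < window) {
            return false;  // not enough data yet; try again later
        }
        out.resize(window);
        for (std::size_t i = 0; i < window; ++i) {
            out[i] = data_[(head_ + i) % data_.size()];
        }
        head_ = (head_ + hop) % data_.size();
        count_ -= hop;
        return true;
    }

private:
    std::vector<float> data_;
    std::size_t head_;   // index of the oldest unread sample
    std::size_t count_;  // number of unread samples
    std::mutex mutex_;
};
```

The capacity has to be at least the largest window you allow, plus enough headroom for whatever the audio callback delivers while the analysis thread is busy.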

Not sure why you mention that the buffer is lock-free, or whether that is a requirement. I'd try the circular buffer with locks first, as it seems conceptually simpler, and only go lock-free if you have to, because the data structure could be more complicated in that case (but maybe a "producer-consumer" lock-free queue would work).

HTH.

Answer 2

Thanks for posting a graphic; it illustrates the problem nicely.

All you really need here is a buffer of size (window - 1) where you can store zero or more samples from the "previous" frame for processing in the "next" one. In C++ this would be:

std::vector<Sample> interframeBuffer;
interframeBuffer.reserve(windowSize - 1);

Then, when you are within windowSize samples of the end of the current frame, rather than processing the samples you store them with interframeBuffer.push_back(sample). When you start processing the next frame, you first do:

for (const Sample& sample : interframeBuffer) {
    process(sample);
}
interframeBuffer.clear();

You should use a single vector the whole time, clearing it and repopulating it as needed, to avoid memory allocation. That's why we call reserve() at the top: to avoid latency later on. Calling clear() doesn't release the memory, it just resets size() to zero.
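Putting the pieces together, the carry-over idea could look roughly like the sketch below. It is a variation on the snippets above rather than the exact same code: instead of processing sample by sample, it keeps the leftover tail in one vector and hands out whole windows. The names FrameSlider, pushFrame() and pending(), the float sample type, and the maxFrameSize parameter are all assumptions made for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: keeps the unconsumed tail of each frame so that
// windows straddling a frame boundary are not lost.
class FrameSlider {
public:
    FrameSlider(std::size_t windowSize, std::size_t hopSize, std::size_t maxFrameSize)
        : window_(windowSize), hop_(hopSize) {
        assert(hopSize > 0 && hopSize <= windowSize);
        // At most windowSize - 1 samples carry over, plus one incoming frame.
        pending_.reserve(windowSize - 1 + maxFrameSize);
    }

    // Append one application frame, then call onWindow(ptr, windowSize) for
    // every complete window, advancing by hopSize. Returns the window count.
    template <typename Fn>
    std::size_t pushFrame(const float* frame, std::size_t n, Fn onWindow) {
        pending_.insert(pending_.end(), frame, frame + n);
        std::size_t start = 0;
        std::size_t windows = 0;
        while (start + window_ <= pending_.size()) {
            onWindow(pending_.data() + start, window_);  // e.g. run the FFT here
            start += hop_;
            ++windows;
        }
        // Drop what has been hopped over; the rest (fewer than windowSize
        // samples) is carried into the next frame.
        pending_.erase(pending_.begin(), pending_.begin() + start);
        return windows;
    }

    // Number of samples currently carried over.
    std::size_t pending() const { return pending_.size(); }

private:
    std::size_t window_;
    std::size_t hop_;
    std::vector<float> pending_;
};
```

With windowSize = 4 and hopSize = 2, feeding two 5-sample frames yields windows starting at samples 0, 2, 4 and 6 of the combined stream, so the window sequence is the same as if the data had arrived in one piece. As long as no frame exceeds maxFrameSize, the vector should not need to reallocate after construction.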
