Help with using FFT to determine frequency of an audio sample

Original post: 2010-12-20 14:42:29. Tags: java, signal-processing, apache-commons, fft, frequency-analysis

I'm currently developing a percussion tutorial program. The program needs to determine which drum is being played. To do this, I was going to analyse the frequency of the drum recording and check whether it falls within a given range.

I have been using the Apache Commons Math implementation of the FFT so far (http://commons.apache.org/math/), but my question is: once I perform the FFT, how do I use the array of results to calculate the frequencies contained in the signal?

Note: I have also tried experimenting with autocorrelation, but it didn't seem to work too well with samples from a drum kit.

Any help or alternative suggestions on how to determine which drum is being hit would be greatly appreciated.

Edit: Since writing this, I've found a great online lesson on implementing FFT in Java for time/frequency transformations: Spectrum Analysis in Java.

2 answers

In the area of music information retrieval, people often use a related set of features known as mel-frequency cepstral coefficients (MFCCs).

For any N-sample segment of your signal, take the FFT. The resulting N values are transformed into a set of MFCCs containing, say, 12 elements (i.e., coefficients). This 12-element vector is used to classify the instrument, including which drum is being played.
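As a rough sketch of how a frame could be turned into such a 12-element vector, the code below follows one common recipe: power spectrum, triangular filters spaced on the mel scale, log of each filter's energy, then a DCT. Everything specific here is an assumption rather than something from the answer: the class and method names are made up, the mel formula 2595 * log10(1 + f / 700) is one widely used convention, 26 filters is just a typical choice, and windowing and pre-emphasis are left out. A naive DFT stands in for an FFT library so the example has no dependencies.

```java
// Hypothetical sketch of MFCC extraction for one frame (even frame length assumed).
class MfccSketch {
    static double hzToMel(double hz) { return 2595.0 * Math.log10(1.0 + hz / 700.0); }
    static double melToHz(double mel) { return 700.0 * (Math.pow(10.0, mel / 2595.0) - 1.0); }

    /** Power of DFT bins 0..N/2, computed with a naive O(N^2) DFT. */
    static double[] powerSpectrum(double[] x) {
        int n = x.length;
        double[] p = new double[n / 2 + 1];
        for (int k = 0; k < p.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double a = 2.0 * Math.PI * k * t / n;
                re += x[t] * Math.cos(a);
                im -= x[t] * Math.sin(a);
            }
            p[k] = (re * re + im * im) / n;
        }
        return p;
    }

    /** Log energy of each triangular filter; filters are evenly spaced in mel from 0 Hz to Nyquist. */
    static double[] logMelEnergies(double[] power, int frameLength, double sampleRate, int numFilters) {
        double maxMel = hzToMel(sampleRate / 2.0);
        double[] edgeBin = new double[numFilters + 2]; // fractional bin index of each filter edge
        for (int i = 0; i < edgeBin.length; i++) {
            double hz = melToHz(maxMel * i / (numFilters + 1));
            edgeBin[i] = hz * frameLength / sampleRate;
        }
        double[] logE = new double[numFilters];
        for (int m = 0; m < numFilters; m++) {
            double lo = edgeBin[m], mid = edgeBin[m + 1], hi = edgeBin[m + 2];
            double energy = 0.0;
            for (int k = 0; k < power.length; k++) {
                double w = 0.0;
                if (k > lo && k <= mid) w = (k - lo) / (mid - lo);      // rising slope
                else if (k > mid && k < hi) w = (hi - k) / (hi - mid);  // falling slope
                energy += w * power[k];
            }
            logE[m] = Math.log(energy + 1e-10); // small floor avoids log(0)
        }
        return logE;
    }

    /** Unnormalised DCT-II of the log energies, keeping coefficients 1..numCoeffs (c0 is skipped). */
    static double[] dct(double[] logE, int numCoeffs) {
        int m = logE.length;
        double[] c = new double[numCoeffs];
        for (int i = 0; i < numCoeffs; i++) {
            for (int j = 0; j < m; j++) {
                c[i] += logE[j] * Math.cos(Math.PI * (i + 1) * (j + 0.5) / m);
            }
        }
        return c;
    }

    static double[] mfcc(double[] frame, double sampleRate, int numFilters, int numCoeffs) {
        double[] power = powerSpectrum(frame);
        return dct(logMelEnergies(power, frame.length, sampleRate, numFilters), numCoeffs);
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0;
        double[] frame = new double[512];
        for (int t = 0; t < frame.length; t++) {
            frame[t] = Math.sin(2.0 * Math.PI * 250.0 * t / sampleRate); // synthetic test tone
        }
        double[] coeffs = mfcc(frame, sampleRate, 26, 12);
        System.out.println("Number of coefficients: " + coeffs.length);
    }
}
```

In practice each recorded drum hit would be cut into one or more frames, and the vector returned by `mfcc` for each frame would be the feature handed to the classifier.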

To do supervised classification, you can use something like a support vector machine (SVM). LIBSVM is a commonly used library with interfaces for Java and many other languages. You train the SVM with these MFCCs and their corresponding instrument labels. Then you test it by feeding it a query MFCC vector, and it will tell you which instrument it is.

So the basic procedure, in summary:

1. Get FFT.
2. Get MFCCs from FFT.
3. Train SVM with MFCCs and instrument labels.
4. Query the SVM with MFCCs of the query signal.
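To show the shape of the train and query steps without depending on any library, here is a small sketch that swaps the SVM for a much simpler nearest-centroid classifier. This is not LIBSVM and not an SVM: it averages the training vectors for each label and assigns a query to the closest average. The class name, the labels, and the two-element feature vectors in `main` are toy values invented for illustration; real input would be the MFCC vectors.

```java
// Hypothetical stand-in for the train/query steps: nearest-centroid, NOT an SVM.
class NearestCentroidSketch {
    private final String[] labels;      // one entry per class, e.g. "kick"
    private final double[][] centroids; // mean feature vector per class

    /** "Training": examplesPerLabel[c] holds the feature vectors recorded for labels[c]. */
    NearestCentroidSketch(String[] labels, double[][][] examplesPerLabel) {
        this.labels = labels;
        this.centroids = new double[labels.length][];
        for (int c = 0; c < labels.length; c++) {
            double[][] examples = examplesPerLabel[c];
            double[] mean = new double[examples[0].length];
            for (double[] example : examples) {
                for (int i = 0; i < mean.length; i++) {
                    mean[i] += example[i] / examples.length;
                }
            }
            centroids[c] = mean;
        }
    }

    /** "Query": return the label whose centroid is closest in squared Euclidean distance. */
    String classify(double[] query) {
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int c = 0; c < centroids.length; c++) {
            double distance = 0.0;
            for (int i = 0; i < query.length; i++) {
                double diff = query[i] - centroids[c][i];
                distance += diff * diff;
            }
            if (distance < bestDistance) {
                bestDistance = distance;
                best = c;
            }
        }
        return labels[best];
    }

    public static void main(String[] args) {
        // Toy 2-element feature vectors, invented purely to exercise the code.
        String[] labels = {"kick", "snare"};
        double[][][] training = {
            {{1.0, 0.0}, {1.2, 0.1}},   // examples labelled "kick"
            {{-1.0, 2.0}, {-0.8, 2.2}}  // examples labelled "snare"
        };
        NearestCentroidSketch classifier = new NearestCentroidSketch(labels, training);
        System.out.println(classifier.classify(new double[] {0.9, 0.2}));
    }
}
```

An SVM library would replace the constructor with its own training call and `classify` with its prediction call, while the data flow (labelled vectors in, a label out) stays the same.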

Check for Java packages that do these things. (They must exist; I just don't know them.) Drum transcription is relatively easy compared with most other instrument groups, so I am optimistic that this would work.

For further reading, there are a whole bunch of articles on drum transcription.

When I made a program using a DFT, I had it create an array of frequencies and the amplitude at each frequency. I could then find the largest amplitudes and compare those to musical notes, getting a good grasp of what was played. If you know the approximate frequency of the drum, you should be able to do the same.
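A minimal sketch of that idea (not the answerer's actual program): bin k of an N-point transform taken at sample rate fs corresponds to a frequency of k * fs / N Hz, so the bin with the largest magnitude gives the dominant frequency. The class and method names below are made up for illustration, and a naive DFT stands in for an FFT library so the example has no dependencies. With Apache Commons Math, the `Complex[]` returned by `FastFourierTransformer` would take the place of the inner loop, with `Complex.abs()` giving each bin's magnitude.

```java
// Hypothetical sketch: find the strongest frequency in a block of samples.
class PeakFrequency {

    /** Magnitude of DFT bins 0..N/2 for a real-valued signal (naive O(N^2) DFT). */
    static double[] magnitudeSpectrum(double[] samples) {
        int n = samples.length;
        double[] magnitudes = new double[n / 2 + 1];
        for (int k = 0; k < magnitudes.length; k++) {
            double re = 0.0, im = 0.0;
            for (int t = 0; t < n; t++) {
                double angle = 2.0 * Math.PI * k * t / n;
                re += samples[t] * Math.cos(angle);
                im -= samples[t] * Math.sin(angle);
            }
            magnitudes[k] = Math.hypot(re, im);
        }
        return magnitudes;
    }

    /** Centre frequency in Hz of bin k for an n-point transform: k * sampleRate / n. */
    static double binToFrequency(int k, double sampleRate, int n) {
        return k * sampleRate / n;
    }

    /** Frequency in Hz of the largest-magnitude bin, ignoring the DC bin (k = 0). */
    static double dominantFrequency(double[] samples, double sampleRate) {
        double[] magnitudes = magnitudeSpectrum(samples);
        int peak = 1;
        for (int k = 2; k < magnitudes.length; k++) {
            if (magnitudes[k] > magnitudes[peak]) {
                peak = k;
            }
        }
        return binToFrequency(peak, sampleRate, samples.length);
    }

    public static void main(String[] args) {
        double sampleRate = 8000.0;
        double[] samples = new double[256];
        for (int t = 0; t < samples.length; t++) {
            // Synthetic 1000 Hz test tone; it falls exactly on bin 32 (1000 * 256 / 8000).
            samples[t] = Math.sin(2.0 * Math.PI * 1000.0 * t / sampleRate);
        }
        System.out.println(dominantFrequency(samples, sampleRate)); // prints 1000.0
    }
}
```

Each bin is sampleRate / N Hz wide, so a longer block gives finer frequency resolution. Checking whether a drum falls "within a given range" then amounts to comparing the returned frequency against that range's lower and upper bounds.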