
Calculating FFT with complex numbers in C#

Original post: 2013-03-05 13:15:22 · 2 3 · c#, speech-recognition, naudio, speech-to-text, audio-processing

I use this formula to get the frequency of a signal, but I don't understand how to implement it in code with complex numbers. There is an "i" in the formula, which corresponds to Math.Sqrt(-1). How can I apply this formula to a signal in C# with the NAudio library?

[image: the formula referred to in the question; not preserved]

3 answers

A lot of languages actually provide built-in libraries for this. One example, in C#.NET, is at this link. It gives you a step-by-step guide to setting up a speech recognition program. It also abstracts you away from the low-level detail of parsing audio for particular phonemes and so on (which, frankly, is pointless given the number of libraries available, unless you want to write a highly optimized version).

If you want to go back to a basic level, then:

You'll want to use some form of probabilistic model, something like a hidden Markov model (HMM). This lets you test what the user says against a collection of models, one for each word they are allowed to say.
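To make "testing what the user says against a model" concrete, here is a minimal sketch of the forward algorithm for a small discrete HMM. The class name `HmmSketch` and the array layout are assumptions for illustration only, not part of any library. A real recognizer would use continuous features and log-probabilities to avoid underflow on long sequences.

```csharp
using System;

// Hypothetical helper for illustration: scores a sequence of discrete
// symbols against one HMM with the forward algorithm.
public static class HmmSketch
{
    // initial[i]       : probability of starting in state i
    // transition[i, j] : probability of moving from state i to state j
    // emission[i, k]   : probability that state i emits symbol k
    // Returns P(observations | model).
    public static double Likelihood(
        double[] initial, double[,] transition, double[,] emission, int[] observations)
    {
        int n = initial.Length;
        if (observations.Length == 0) return 1.0;

        // alpha[i] = P(observations so far, current state = i)
        var alpha = new double[n];
        for (int i = 0; i < n; i++)
            alpha[i] = initial[i] * emission[i, observations[0]];

        for (int t = 1; t < observations.Length; t++)
        {
            var next = new double[n];
            for (int j = 0; j < n; j++)
            {
                double sum = 0.0;
                for (int i = 0; i < n; i++)
                    sum += alpha[i] * transition[i, j];
                next[j] = sum * emission[j, observations[t]];
            }
            alpha = next;
        }

        // Sum over all possible final states.
        double total = 0.0;
        foreach (double a in alpha) total += a;
        return total;
    }
}
```

You would keep one such model per word, call `Likelihood` on each with the same observation sequence, and compare the scores.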

Additionally, you want to transform the audio waveform into something that your program can more easily interpret, using something like a fast Fourier transform (FFT) or a continuous wavelet transform (CWT).
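On the "i" in the formula: `Math.Sqrt(-1)` just returns NaN, so you don't compute i yourself. .NET has a complex type, `System.Numerics.Complex`, that handles it. The formula image from the question isn't available, so the sketch below assumes the standard discrete Fourier transform, X[k] = sum over n of x[n] * e^(-2*pi*i*k*n/N). The class name `DftSketch` is made up for illustration. This is the direct O(N^2) form, not a fast algorithm; an FFT computes the same values faster. NAudio also has its own FFT helper in the NAudio.Dsp namespace, so check its documentation for the exact signature.

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration: a direct (slow) DFT.
public static class DftSketch
{
    // Returns X[k] = sum_t samples[t] * e^(-2*pi*i*k*t/N) for k = 0..N-1.
    public static Complex[] Transform(double[] samples)
    {
        int n = samples.Length;
        var spectrum = new Complex[n];
        for (int k = 0; k < n; k++)
        {
            Complex sum = Complex.Zero;
            for (int t = 0; t < n; t++)
            {
                double angle = -2.0 * Math.PI * k * t / n;
                // A unit-length complex number at this angle:
                // e^(i*angle) = cos(angle) + i*sin(angle)
                sum += samples[t] * Complex.FromPolarCoordinates(1.0, angle);
            }
            spectrum[k] = sum;
        }
        return spectrum;
    }
}
```

The strength of each frequency is `spectrum[k].Magnitude`. Bin k corresponds to k * sampleRate / N Hz, and for a real-valued signal only bins 0 to N/2 carry independent information.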

The steps would be:

Get audio
Remove background noise
Transform via FFT or CWT
Detect peaks and other features of the audio
Compare these features with your HMMs
Pick the HMM with the best result above a threshold.
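Two of the steps above are small enough to sketch: finding the strongest frequency in a magnitude spectrum, and picking the best-scoring word only if it clears a threshold. The names `PipelineSketch`, `DominantFrequency` and `TryPickBest` are hypothetical. A real feature extractor would look at far more than a single peak.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helpers for illustration only.
public static class PipelineSketch
{
    // magnitudes[k] is the magnitude of spectrum bin k (k = 0..N-1).
    // Returns the frequency in Hz of the strongest bin in 1..N/2,
    // skipping bin 0 (the DC offset).
    public static double DominantFrequency(double[] magnitudes, int sampleRate)
    {
        int n = magnitudes.Length;
        if (n < 2) throw new ArgumentException("Need at least 2 bins.", nameof(magnitudes));

        int best = 1;
        for (int k = 2; k <= n / 2; k++)
            if (magnitudes[k] > magnitudes[best]) best = k;

        return (double)best * sampleRate / n;
    }

    // scores maps each candidate word to its model score.
    // Returns true and the best word only if its score is above the threshold.
    public static bool TryPickBest(
        IDictionary<string, double> scores, double threshold, out string word)
    {
        word = "";
        double bestScore = threshold;
        bool found = false;
        foreach (var pair in scores)
        {
            if (pair.Value > bestScore)
            {
                bestScore = pair.Value;
                word = pair.Key;
                found = true;
            }
        }
        return found;
    }
}
```

Returning false when nothing clears the threshold lets the caller treat the input as "not recognized" instead of forcing a match to the least-bad word.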

Of course, this requires you to train the HMMs beforehand with the correct words.

It is a difficult problem nonetheless, and you will have to use an ASR framework to do it. I have done something slightly more complex (~100 words) using Sphinx4. You can also use HTK.

In general, what you have to do is:

write down all the words that you want to recognize
determine the syntax of your commands, e.g. (direction) (amount)
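As a rough illustration of what "words plus syntax" means, here is a sketch that checks already-recognized text against a (direction) (amount) pattern. The word lists and the class name `CommandGrammarSketch` are made up for the example. In an actual ASR framework you would express this in the framework's own grammar format (JSGF in the case of Sphinx4) so the recognizer itself is constrained by it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example vocabulary and grammar: (direction) (amount).
public static class CommandGrammarSketch
{
    static readonly HashSet<string> Directions =
        new HashSet<string> { "left", "right", "forward", "back" };

    static readonly Dictionary<string, int> Amounts =
        new Dictionary<string, int> { { "one", 1 }, { "two", 2 }, { "three", 3 } };

    // Returns true only if the utterance is exactly "<direction> <amount>".
    public static bool TryParse(string utterance, out string direction, out int amount)
    {
        direction = "";
        amount = 0;

        string[] words = utterance.ToLowerInvariant()
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);

        if (words.Length != 2) return false;
        if (!Directions.Contains(words[0])) return false;
        if (!Amounts.TryGetValue(words[1], out amount)) return false;

        direction = words[0];
        return true;
    }
}
```

For example, "left two" parses to direction "left" and amount 2, while "two left" is rejected because the words are in the wrong order.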

Then choose a framework, get an acoustic model, and generate a dictionary and a language model compatible with that framework. Then integrate the framework into your application.

I hope I have mentioned all the important things you need to do. You can google them separately or go to your chosen framework's tutorial.

Your task is relatively simple in terms of speech recognition, and you should get good results if you complete it.