
Quality of Windows speech recognition (SAPI) from a file

I am trying to perform speech recognition on an audio stream received over UDP, using Microsoft speech recognition (SAPI). When I test recognition with my microphone, the quality is good (both in C# and in C++). However, once the same audio comes from a WAV file (or from a memory buffer filled from my UDP stream), the recognition rate drops drastically. I tried saving the file at 44100 Hz in Audacity, and I also wrote my own C# code to write a WAV file. I use the exact same microphone in both cases, and the recording sounds fine when I play the file back.

Could SAPI be using different models for microphone input and file input? Has anyone encountered this problem, and is there a solution?

Below is my C# code (though I have the exact same problem in C++).

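using System.Speech.Recognition;

// Recognize free-form dictation from a WAV file instead of the microphone.
SpeechRecognitionEngine recognizer = new SpeechRecognitionEngine();
Grammar dictationGrammar = new DictationGrammar();
recognizer.LoadGrammar(dictationGrammar);
recognizer.SetInputToWaveFile("c:\\path\\to\\file.wav");
RecognitionResult result = recognizer.Recognize();
// Recognize() returns null when nothing is recognized.
text1.Text = result != null ? result.Text : "";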

I don't know why, but reducing the amplitude of the file by a factor of 10 helped me a lot (I used the code from "Reduce the volume of a Wav audio file using C#").

Perhaps SAPI attenuates the microphone input when it listens live, in which case the same attenuation has to be simulated when loading a WAV file.
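For illustration, the amplitude reduction could be sketched roughly as follows. This is not the code from the linked answer. It assumes 16-bit little-endian PCM samples, and the class name PcmVolume and method Scale16Bit are made up for this example.

```csharp
using System;

public static class PcmVolume
{
    // Hypothetical helper: scales 16-bit little-endian PCM samples in place.
    // `offset` is the index of the first sample byte in `buffer`.
    // `factor` is the gain, e.g. 0.1 to reduce the amplitude by a factor of 10.
    public static void Scale16Bit(byte[] buffer, int offset, double factor)
    {
        for (int i = offset; i + 1 < buffer.Length; i += 2)
        {
            // Rebuild the signed sample from its low and high bytes.
            short sample = (short)(buffer[i] | (buffer[i + 1] << 8));
            double scaled = Math.Round(sample * factor);

            // Clamp so that factors above 1.0 cannot overflow a 16-bit sample.
            if (scaled > short.MaxValue) scaled = short.MaxValue;
            if (scaled < short.MinValue) scaled = short.MinValue;

            short result = (short)scaled;
            buffer[i] = (byte)(result & 0xFF);
            buffer[i + 1] = (byte)((result >> 8) & 0xFF);
        }
    }
}
```

For a raw PCM buffer (such as one filled from a UDP stream) the offset would be 0. For a WAV file read into memory, the sample data starts after the header, which is 44 bytes only for a canonical PCM file with no extra chunks, so that value should be checked rather than assumed. The scaled bytes can then be wrapped in a MemoryStream and passed to recognizer.SetInputToWaveStream.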
