Azure speech-to-text ignores numbers

I'm using Azure speech-to-text to find the timestamps of utterances in a WAV file.

The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens for English and for other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?

I'm using:

  • speechConfig.OutputFormat = OutputFormat.Detailed;
  • the default language model.

Can I somehow configure the SpeechRecognizer differently so it also outputs numbers?

  • Using the following code, I was able to convert a .wav audio file to text without losing any data.

    using System;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;

    string speechKey = "<Your_Key>";
    string speechRegion = "<Your_Region>";

    var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
    speechConfig.SpeechRecognitionLanguage = "en-US";

    // Read the audio from a WAV file instead of the default microphone.
    using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
    using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

    // Recognize a single utterance and print its text.
    var speechRecognitionResult = await speechRecognizer.RecognizeOnceAsync();
    Console.WriteLine(speechRecognitionResult.Text);

Output: (screenshot of the console output, not preserved)

  • However, if there is a pause between "I'm going to count to three." and "One, two, three, here I come", the second sentence is missing from the result. This looks like a bug in the conversion model, although RecognizeOnceAsync returns only a single utterance, so it may simply be stopping at the first pause.

  • Also, I couldn't find anything in the Microsoft documentation for the AudioConfig class that configures the audio settings relevant to this issue.
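
For audio that contains pauses, the Speech SDK's continuous recognition mode is one way to receive every utterance rather than only the first. The following is a minimal sketch and has not been run against the asker's file. The key, region, and file path are placeholders, and printing the offset and duration is only an assumption about how the utterance timestamps might be used.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholders: substitute your own key, region, and file path.
        var speechConfig = SpeechConfig.FromSubscription("<Your_Key>", "<Your_Region>");
        speechConfig.SpeechRecognitionLanguage = "en-US";
        speechConfig.OutputFormat = OutputFormat.Detailed;

        using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Completed when the session stops or is canceled (e.g. end of file).
        var done = new TaskCompletionSource<int>(
            TaskCreationOptions.RunContinuationsAsynchronously);

        // Raised once per recognized utterance, including those after a pause.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                var start = TimeSpan.FromTicks(e.Result.OffsetInTicks);
                Console.WriteLine($"[{start} +{e.Result.Duration}] {e.Result.Text}");
            }
        };
        recognizer.Canceled += (s, e) => done.TrySetResult(0);
        recognizer.SessionStopped += (s, e) => done.TrySetResult(0);

        await recognizer.StartContinuousRecognitionAsync();
        await done.Task;
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```

Each Recognized event carries one utterance, so a file with a pause in the middle should produce two or more lines of output instead of one.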

I found the reason my results did not contain numbers: it was in my own code. In my post-processing I was trying to strip punctuation marks from the result, and I was accidentally stripping the numbers as well.
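
The original post-processing code is not shown, so the following is only an illustration of how this kind of mistake can happen. `TranscriptCleanup`, `StripTooMuch`, and `StripPunctuation` are hypothetical names. The first method keeps only ASCII letters and spaces, which silently drops digits. The second keeps Unicode letters, digits, and whitespace.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TranscriptCleanup
{
    // Hypothetical buggy version: anything that is not an ASCII letter
    // or a space is removed, so digits disappear with the punctuation.
    public static string StripTooMuch(string text) =>
        Regex.Replace(text, "[^a-zA-Z ]", "");

    // Keeps Unicode letters (\p{L}), numbers (\p{N}), and whitespace (\s);
    // everything else, including punctuation, is removed.
    public static string StripPunctuation(string text) =>
        Regex.Replace(text, @"[^\p{L}\p{N}\s]", "");
}
```

For example, `TranscriptCleanup.StripPunctuation("Count to 3: 1, 2, 3, here I come.")` returns `Count to 3 1 2 3 here I come`, whereas `StripTooMuch` on the same input loses every digit. This matters because the recognizer's display text may render spoken numbers as digits rather than words.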
