
Azure speech-to-text ignores numbers

I'm using Azure speech-to-text to find the timestamps of utterances in a WAV file.

The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens both for English and for other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?

I'm using:

  • speechConfig.OutputFormat = OutputFormat.Detailed;
  • the default language model.

Can I somehow configure the SpeechRecognizer differently so that it also outputs numbers?

  • Using the following code, I was able to convert a .wav audio file to text without any loss of data.
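    using System;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;

    // Placeholders: substitute your own Speech resource key and region.
    string speechKey = "<Your_Key>";
    string speechRegion = "<Your_Region>";

    var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
    speechConfig.SpeechRecognitionLanguage = "en-US";

    // Read the audio from a WAV file instead of the default microphone.
    using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
    using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

    // Recognize a single utterance from the file and print its text.
    var speechRecognitionResult = await speechRecognizer.RecognizeOnceAsync();
    Console.WriteLine(speechRecognitionResult.Text);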

Output: (screenshot not included)

  • But apparently there is a bug in the conversion model: if there is a pause between "I'm going to count to three." and "One, two, three, here I come", the model omits the "One, two, three, here I come" sentence from the transcription.

  • Also, I couldn't find anything in the Microsoft documentation for the AudioConfig class that would configure the audio settings to address this issue.
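The pause behaviour described above is consistent with how RecognizeOnceAsync is documented: it returns after a single utterance, and the end of that utterance is detected by listening for silence. For a file that contains several utterances, continuous recognition is the usual alternative. The following is a minimal sketch, not a tested configuration; it reuses the placeholder key, region, and file path from the answer and assumes the Microsoft.CognitiveServices.Speech NuGet package is installed.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class Program
{
    static async Task Main()
    {
        // Placeholders: substitute your own key, region, and file path.
        var speechConfig = SpeechConfig.FromSubscription("<Your_Key>", "<Your_Region>");
        speechConfig.SpeechRecognitionLanguage = "en-US";
        speechConfig.OutputFormat = OutputFormat.Detailed;

        using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Completed when the session ends or recognition is canceled
        // (for file input, cancellation also signals end of stream).
        var stopRecognition = new TaskCompletionSource<int>();

        // Raised once per recognized utterance, so speech after a pause
        // arrives as a separate result instead of being dropped.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                var start = TimeSpan.FromTicks(e.Result.OffsetInTicks);
                Console.WriteLine($"[{start} + {e.Result.Duration}] {e.Result.Text}");
            }
        };
        recognizer.Canceled += (s, e) => stopRecognition.TrySetResult(0);
        recognizer.SessionStopped += (s, e) => stopRecognition.TrySetResult(0);

        await recognizer.StartContinuousRecognitionAsync();
        await stopRecognition.Task;
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```

Each result carries its own offset and duration, which also covers the original goal of finding utterance timestamps in the file.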

I found the reason my results did not contain numbers: the error was in my own code. In my post-processing I was trying to strip punctuation marks from the result, and in doing so I was also accidentally stripping the numbers.
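The poster's post-processing code is not shown, so the following is only a hypothetical illustration of how this kind of mistake can happen. The class and method names (TranscriptCleaner, StripPunctuationBuggy, StripPunctuation) are invented for the example. A filter that keeps only letters and whitespace silently drops digits, whereas a filter that removes only punctuation characters leaves them intact.

```csharp
using System;
using System.Linq;

public static class TranscriptCleaner
{
    // Hypothetical buggy version: keeps only letters and whitespace,
    // so digits are discarded together with the punctuation.
    public static string StripPunctuationBuggy(string text) =>
        new string(text.Where(c => char.IsLetter(c) || char.IsWhiteSpace(c)).ToArray());

    // Safer version: removes only characters classified as punctuation,
    // leaving letters, digits, and whitespace untouched.
    public static string StripPunctuation(string text) =>
        new string(text.Where(c => !char.IsPunctuation(c)).ToArray());
}
```

For example, TranscriptCleaner.StripPunctuation("Count to 3. 1, 2, 3, here I come!") returns "Count to 3 1 2 3 here I come", while the buggy variant loses every digit from the same input.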
