Azure speech to text ignores numbers

Question

I'm using Azure Speech to Text to find the timestamps of utterances in a WAV file.

The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens both for English and for other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?

I'm using:

speechConfig.OutputFormat = OutputFormat.Detailed;
the default language model.

Can I somehow configure the SpeechRecognizer differently so that it also outputs numbers?

Answer 1

Using the following code, I was able to convert a .wav audio file to text without any loss of data.

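    using System;
    using Microsoft.CognitiveServices.Speech;
    using Microsoft.CognitiveServices.Speech.Audio;

    // Replace the placeholders with your own Speech resource key and region.
    string speechKey = "<Your_Key>";
    string speechRegion = "<Your_Region>";

    var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
    speechConfig.SpeechRecognitionLanguage = "en-US";

    // Read the audio from a WAV file instead of the microphone.
    using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
    using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

    // Recognize a single utterance and print its text.
    var speechRecognitionResult = await speechRecognizer.RecognizeOnceAsync();
    Console.WriteLine(speechRecognitionResult.Text);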

Output: [screenshot not preserved]

But apparently there is a bug in the conversion model: if there is a pause between "I'm going to count to three." and "One, two, three, here I come", the model omits the "One, two, three, here I come" sentence from the audio file.
Also, I couldn't find anything in this MSDOC on the audio config class for configuring the audio settings related to this issue.
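
A possible explanation for the pause behavior, offered as a hedged note rather than a confirmed diagnosis: RecognizeOnceAsync is documented to return after a single utterance, and the end of an utterance is detected by silence, so anything spoken after a pause would not be in its result. A minimal sketch using continuous recognition instead is shown here. The key, region and file path are placeholders, and printing the offset and duration of each utterance is an assumption about what the asker wants for timestamps.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class ContinuousRecognitionSketch
{
    static async Task Main()
    {
        // Placeholders: supply your own key, region and WAV path.
        var speechConfig = SpeechConfig.FromSubscription("<Your_Key>", "<Your_Region>");
        speechConfig.SpeechRecognitionLanguage = "en-US";

        using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
        using var recognizer = new SpeechRecognizer(speechConfig, audioConfig);

        // Completed when the file has been fully processed or recognition is canceled.
        var stopRecognition = new TaskCompletionSource<int>();

        // Raised once per recognized utterance, so pauses produce several results.
        recognizer.Recognized += (s, e) =>
        {
            if (e.Result.Reason == ResultReason.RecognizedSpeech)
            {
                var start = TimeSpan.FromTicks(e.Result.OffsetInTicks);
                Console.WriteLine($"[{start} +{e.Result.Duration}] {e.Result.Text}");
            }
        };

        recognizer.Canceled += (s, e) =>
        {
            Console.WriteLine($"Canceled: {e.Reason}");
            stopRecognition.TrySetResult(0);
        };

        recognizer.SessionStopped += (s, e) => stopRecognition.TrySetResult(0);

        await recognizer.StartContinuousRecognitionAsync();
        await stopRecognition.Task;
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```

With a file input, the session normally ends once the end of the audio is reached, which is why the sketch waits on the Canceled and SessionStopped events rather than on a fixed delay.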

Answer 2

I found the error behind my results not recognizing numbers: it was in my own code. In my postprocessing I was trying to strip punctuation marks from the result, and I was accidentally getting rid of the numbers as well.
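
The actual postprocessing code was not posted, so the following is only a hypothetical sketch of how this kind of mistake can happen. The class and method names (TranscriptCleanup, StripNonLetters, StripPunctuation) are made up for illustration. Filtering by "keep letters and whitespace" silently drops digits, whereas filtering by "drop punctuation" leaves them in place.

```csharp
using System;
using System.Linq;

static class TranscriptCleanup
{
    // Hypothetical buggy variant: keeps only letters and whitespace,
    // so digits are removed together with the punctuation.
    public static string StripNonLetters(string text) =>
        new string(text.Where(c => char.IsLetter(c) || char.IsWhiteSpace(c)).ToArray());

    // Safer variant: removes only characters that .NET classifies as punctuation.
    // Note that this also removes apostrophes, so "I'm" becomes "Im".
    public static string StripPunctuation(string text) =>
        new string(text.Where(c => !char.IsPunctuation(c)).ToArray());
}

class Demo
{
    static void Main()
    {
        // Assumed example of recognized display text that contains digits.
        const string recognized = "I'm going to count to 3. 1, 2, 3, here I come.";

        Console.WriteLine(TranscriptCleanup.StripNonLetters(recognized));
        Console.WriteLine(TranscriptCleanup.StripPunctuation(recognized));
    }
}
```

If apostrophes or other specific marks need to survive, an explicit list of characters to remove may be a better fit than a character-class test.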

Answer 1: 0, 2023-01-16 12:58:42
Answer 2: 0, accepted, 2023-01-17 14:41:36
