Azure speech-to-text ignores numbers
I'm using Azure speech-to-text to find timestamps of utterances in a wav file.
The problem I'm encountering is that if the user has recorded numbers, for instance "I'm going to count to three. One, two, three, here I come", the numbers are omitted from the output. This happens both for English and other languages. I can understand utterances like 'eh' and 'ah' being omitted, but numbers? Why is that the default?
I'm using:
Can I somehow configure the SpeechRecognizer differently so it also outputs numbers?
So, using the code below I was able to convert the .wav audio file to text without loss of data.
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Replace the placeholders with your own resource key, region and file path.
string speechKey = "<Your_Key>";
string speechRegion = "<Your_Region>";
var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
speechConfig.SpeechRecognitionLanguage = "en-US";
using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);
// Single-shot recognition: returns after the first recognized utterance.
var speechRecognitionResult = await speechRecognizer.RecognizeOnceAsync();
Console.WriteLine(speechRecognitionResult.Text);
Output:
But apparently there is a bug in the conversion model: if there is a pause between "I'm going to count to three." and "One, two, three, here I come", the model will omit the sentence "One, two, three, here I come" from the transcription of the audio file.
Also, I couldn't find anything in the MSDOC on the AudioConfig class that configures the audio settings related to this issue.
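For what it's worth, the pause behavior described above matches how RecognizeOnceAsync is documented: it does single-shot recognition and returns after the first utterance, which ends at a silence. Below is a minimal sketch of the continuous-recognition alternative, assuming the same placeholder key, region and file path as above. The stopRecognition name is only for illustration. It also prints the offset and duration of each utterance, since the question is about timestamps. I have not run it against the asker's audio, so treat it as a starting point rather than a confirmed fix.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

// Placeholders (assumptions): fill in your own key, region and file path.
string speechKey = "<Your_Key>";
string speechRegion = "<Your_Region>";

var speechConfig = SpeechConfig.FromSubscription(speechKey, speechRegion);
speechConfig.SpeechRecognitionLanguage = "en-US";

using var audioConfig = AudioConfig.FromWavFileInput("<Path to File>");
using var speechRecognizer = new SpeechRecognizer(speechConfig, audioConfig);

// Completed when the session stops or recognition is canceled
// (for file input, cancellation also happens at end of stream).
var stopRecognition = new TaskCompletionSource<int>();

// Fires once per recognized utterance, not just for the first one.
speechRecognizer.Recognized += (s, e) =>
{
    if (e.Result.Reason == ResultReason.RecognizedSpeech)
    {
        // OffsetInTicks counts 100-nanosecond ticks from the start of the audio.
        var start = TimeSpan.FromTicks(e.Result.OffsetInTicks);
        var end = start + e.Result.Duration;
        Console.WriteLine($"[{start} - {end}] {e.Result.Text}");
    }
};

speechRecognizer.Canceled += (s, e) =>
{
    if (e.Reason == CancellationReason.Error)
    {
        Console.WriteLine($"Canceled: {e.ErrorCode} {e.ErrorDetails}");
    }
    stopRecognition.TrySetResult(0);
};

speechRecognizer.SessionStopped += (s, e) => stopRecognition.TrySetResult(0);

await speechRecognizer.StartContinuousRecognitionAsync();
await stopRecognition.Task; // wait until the whole file has been processed
await speechRecognizer.StopContinuousRecognitionAsync();
```

With this shape every utterance in the file raises its own Recognized event, so a pause between sentences should no longer end the transcription.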
I found the error that caused my results to be missing numbers. It was in my own code. In my postprocessing I was trying to strip punctuation marks from the result, and in doing so I was also accidentally stripping the numbers.
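The original postprocessing code isn't shown, so the following is only a guess at the kind of fix involved. A filter that keeps letters only (for example a regex like [^a-zA-Z\s]) removes digits along with punctuation. A minimal sketch that drops punctuation but leaves digits alone, using a hypothetical helper named StripPunctuation and a made-up sample string:

```csharp
using System;
using System.Linq;

// Hypothetical helper (not from the original post): removes only characters
// that .NET classifies as punctuation, so letters, digits and spaces survive.
static string StripPunctuation(string text) =>
    new string(text.Where(c => !char.IsPunctuation(c)).ToArray());

// Made-up sample standing in for a recognition result.
string recognized = "Count to 3: 1, 2, 3, here I come.";
Console.WriteLine(StripPunctuation(recognized));
// prints "Count to 3 1 2 3 here I come"
```

Note that char.IsPunctuation also treats the apostrophe as punctuation, so "I'm" would become "Im". If contractions matter, that character would need to be excluded from the filter.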