
How to add grammar/hints to microsoft-cognitiveservices-speech-sdk?

I have a basic setup with the JavaScript library of microsoft-cognitiveservices-speech-sdk. I use the browser implementation, not the Node implementation. Overall it works fine, yet some issues occur where the transcription is a bit off.

Background

The project I am working on is a web application that uses speech recognition. The user interacts with the application using business codes like A6, B12, ...

I use webkitSpeechRecognition whenever possible; in any other case I provide a fallback with microsoft-cognitiveservices-speech-sdk, which works very well the majority of the time.
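Roughly, the selection between the two looks like this (a simplified sketch; the key, region and language values are placeholders):

```javascript
import * as SpeechSDK from "microsoft-cognitiveservices-speech-sdk";

function createRecognizer() {
  const NativeRecognition =
    window.SpeechRecognition || window.webkitSpeechRecognition;

  if (NativeRecognition) {
    // Preferred path: the browser's built-in Web Speech API.
    const recognition = new NativeRecognition();
    recognition.lang = "fr-FR";
    return recognition;
  }

  // Fallback path: Azure Speech SDK running in the browser.
  const speechConfig = SpeechSDK.SpeechConfig.fromSubscription(
    "YOUR_SPEECH_KEY",     // placeholder
    "YOUR_SERVICE_REGION"  // placeholder, e.g. "westeurope"
  );
  speechConfig.speechRecognitionLanguage = "fr-FR";

  const audioConfig = SpeechSDK.AudioConfig.fromDefaultMicrophoneInput();
  return new SpeechSDK.SpeechRecognizer(speechConfig, audioConfig);
}
```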

Issue

The business codes are not always correctly transcribed on microsoft-cognitiveservices-speech-sdk. webkitSpeechRecognition does a better job with this.

Example (in French):

  • User > A20 (pronounced "a vingt")
  • STT > Avant
  • Expected: A20

This might seem close, but it isn't, and webkitSpeechRecognition is able to solve this one correctly. According to the documentation, it seems that one can provide a dynamic grammar and suggestions/hints to help the STT, yet I wasn't able to find an example or a way to use this interface. I was wondering if someone might have a lead on this.

To elaborate a bit more, I was thinking of providing an IDynamicGrammar object, but I don't know whether this is the correct approach, nor do I know how to provide it.

Side note

  • I can use a mechanism such as ElasticSearch to find the closest matching code, yet that only takes me so far; I would really like to optimise the STT itself.
  • I cannot force all users to use Chrome.
  • I cannot change the business codes.

Reading through the article: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-phrase-lists?pivots=programming-language-javascript

The phrase list is currently applicable only to the English language.
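For reference, this is roughly how a phrase list would be attached to a recognizer with the JavaScript SDK (a minimal sketch, assuming an existing SpeechRecognizer instance; as noted, it currently only takes effect for English recognition):

```javascript
// Sketch: attach a phrase list to an existing SpeechRecognizer instance.
// Phrase lists currently only apply to English recognition.
const phraseList = SpeechSDK.PhraseListGrammar.fromRecognizer(recognizer);
phraseList.addPhrase("A6");
phraseList.addPhrase("A20");
phraseList.addPhrase("B12");
// phraseList.clear() would remove all previously added phrases.
```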

Alternatively, you could train/customize your own model.

The following article details the same:

[screenshot from the documentation]

Please note that pronunciation mappings/hints in Azure Speech to Text are currently available only for English and German at this point in time.

[screenshot from the documentation]

Reference: https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-custom-speech-test-and-train#related-text-data-for-training
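For completeness: if English or German recognition were an option, the pronunciation data described in that article is a plain-text file with the display form, a tab character, and the spoken form on each line. The entries below are only hypothetical illustrations built from your business codes:

```text
A20	a twenty
B12	b twelve
C26	c twenty six
```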

However, I casually tried the uttered sentences ("related text" data) mentioned in the article referenced above, as this does not have any language restriction.

I created the sample sentences as related text, trained the model, and deployed it. This gave slightly better recognition of the codes/non-grammar words. Sample sentences:

  • This is A 20 Business
  • There is going to be a B 6 Business Model
  • B 6 on the other hand is not doing good as a business
  • Please indicate the C 26 profits.

Out of the Box Speech Recognition:

[screenshot: out-of-the-box transcription results]

After using the custom-trained model for speech recognition:

[screenshot: transcription results with the custom model]

Having said that, I assume that if you train the model with more data - sentences, and audio with labeled text (as this also doesn't have any language restriction) - the custom model will serve your requirement.

To consume the custom model in JavaScript, you could refer to this article:

https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/how-to-specify-source-language?pivots=programming-language-more
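A minimal sketch of wiring the deployed custom endpoint into the JavaScript SDK (the key, region and endpoint ID below are placeholders):

```javascript
// Sketch: point the Speech SDK at a deployed Custom Speech endpoint.
const speechConfig = SpeechSDK.SpeechConfig.fromSubscription(
  "YOUR_SPEECH_KEY",     // placeholder
  "YOUR_SERVICE_REGION"  // placeholder
);
// Endpoint ID of the deployed custom model, copied from its deployment page.
speechConfig.endpointId = "YOUR_CUSTOM_ENDPOINT_ID";
speechConfig.speechRecognitionLanguage = "fr-FR";

const recognizer = new SpeechSDK.SpeechRecognizer(
  speechConfig,
  SpeechSDK.AudioConfig.fromDefaultMicrophoneInput()
);
```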
