
Speak rate in Microsoft Bing Speech API - Text to Speech

I followed the sample application in the GitHub repository below to generate speech from text.

https://github.com/Azure-Samples/Cognitive-Speech-TTS/tree/master/Samples-Http/CSharp

My application runs fine. The only problem is the speaking rate: I cannot get a break/pause after each word.

Input text: yu 7 fsd 2 3 e

This is the SSML I am using:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-IN"><voice xml:lang="en-IN" name="Microsoft Server Speech Text to Speech Voice (en-IN, Ravi, Apollo)">yu 7 fsd 2 3 e</voice></speak>

I want a pause after every character, because I am using this audio to read out captcha text in audio mode.

Please suggest a correct approach.

PS: I don't want to copy and paste the whole code here; I am using the sample from GitHub as-is.

I have also followed the conversation in the comments at the link below, with no luck.

https://docs.microsoft.com/en-us/azure/cognitive-services/speech/home

Write each character as its own quoted sentence, like this: "y". "u". "7". "f". "s". "d". "2". "3". "e". It works on the Bing Speech web page test, so it should work for you as well. Here is the SSML:

<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang="en-US">
    <voice xml:lang="en-US" name="Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)">&quot;y&quot;. &quot;u&quot;. &quot;7&quot;. &quot;f&quot;. &quot;s&quot;. &quot;d&quot;. &quot;2&quot;. &quot;3&quot;. &quot;e&quot;.
    </voice>
</speak>
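
If you would rather control the pause length explicitly, another option is to put a standard SSML `<break>` element between the characters instead of relying on punctuation. Below is a minimal sketch in Python that builds such a document from the captcha text; the same string building ports directly to the C# sample. The function name `build_spelled_ssml` and the `pause_ms` parameter are made up for illustration, and I am assuming the service honours `<break time="...">` as defined in SSML 1.0, so try it with your voice first.

```python
from xml.sax.saxutils import escape, quoteattr


def build_spelled_ssml(text, voice_name, lang="en-US", pause_ms=500):
    """Return SSML that reads each non-space character of `text` on its own,
    with a <break> of `pause_ms` milliseconds between consecutive characters.

    Hypothetical helper, not part of the Microsoft sample.
    """
    # Drop whitespace and XML-escape every remaining character (&, <, >).
    chars = [escape(ch) for ch in text if not ch.isspace()]

    # Standard SSML 1.0 pause element; assumed to be supported by the service.
    pause = '<break time="{}ms"/>'.format(int(pause_ms))
    body = pause.join(chars)

    # Same envelope and namespaces as the SSML in the question.
    template = (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="http://www.w3.org/2001/mstts" xml:lang={lang}>'
        '<voice xml:lang={lang} name={name}>{body}</voice></speak>'
    )
    return template.format(
        lang=quoteattr(lang), name=quoteattr(voice_name), body=body
    )


if __name__ == "__main__":
    print(build_spelled_ssml(
        "yu 7 fsd 2 3 e",
        "Microsoft Server Speech Text to Speech Voice (en-US, ZiraRUS)",
    ))
```

Send the returned string as the request body in place of the hard-coded SSML, and adjust `pause_ms` until the gap between characters is long enough for your listeners.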

