
SpeechSynthesizer and SSML

I have been trying to work with the prosody pitch attribute, but it is not straightforward and does not seem to work. I want to create a simple "do re mi" following the G major scale. The results do not turn out as expected with the various Hz values I have tried. Sometimes the synthesizer seems to do what it wants regardless of what I put in. Example:

        <prosody pitch="0Hz">A</prosody><break time="100ms" />
        <prosody pitch="+2st">E</prosody><break time="100ms" />
        <prosody pitch="+4st">I</prosody><break time="100ms" />
        <prosody pitch="+6st">O</prosody><break time="100ms" />
        <prosody pitch="+8st">U</prosody><break time="100ms" />

Here is a link to the suggestions for a similar issue with the prosody contour attribute:

Improve synthesis with Speech Synthesis Markup Language (SSML): https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/speech-synthesis-markup?tabs=csharp#adjust-prosody

Looking at the alternatives (Amazon, Google, etc.), they all state that neural voices do not fully support pitch. I suspect the same is true of SpeechSynthesizer, which would explain the inconsistent results. Microsoft, please update your documentation accordingly.
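
As a side note, the offsets in the question (+2st, +4st, +6st, +8st) step by whole tones only, whereas a major scale uses the offsets 0, 2, 4, 5, 7, 9, 11, 12 semitones above the tonic. Below is a minimal Python sketch that generates the prosody elements with those offsets and also computes the equal-temperament frequencies of G major starting at G4. The helper names (`scale_ssml`, `note_frequency_hz`) are made up for illustration, and A4 = 440 Hz is an assumed tuning. Whether a particular voice actually honors the pitch values is a separate matter, as noted above.

```python
from xml.sax.saxutils import escape

# Semitone offsets of a major scale relative to its tonic (W-W-H-W-W-W-H).
MAJOR_SCALE_STEPS = [0, 2, 4, 5, 7, 9, 11, 12]


def note_frequency_hz(midi_note: int) -> float:
    """Equal-temperament frequency, assuming A4 (MIDI note 69) is 440 Hz."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)


def scale_ssml(syllables, break_ms=100):
    """Build one <prosody> element per syllable using relative semitone offsets."""
    lines = []
    for syllable, step in zip(syllables, MAJOR_SCALE_STEPS):
        lines.append(
            f'<prosody pitch="+{step}st">{escape(syllable)}</prosody>'
            f'<break time="{break_ms}ms" />'
        )
    return "\n".join(lines)


# Hypothetical usage: eight syllables for one octave of the scale.
ssml_body = scale_ssml(["do", "re", "mi", "fa", "sol", "la", "ti", "do"])

# Absolute frequencies for G major, starting at G4 (MIDI note 67).
g_major_hz = [round(note_frequency_hz(67 + s), 2) for s in MAJOR_SCALE_STEPS]

print(ssml_body)
print(g_major_hz)
```

The generated lines still need to be wrapped in the usual `<speak>` and `<voice>` elements before being passed to the synthesizer. The relative offsets are measured from the voice's default pitch, so the absolute Hz list is only useful if you want to experiment with absolute pitch values instead.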

