
voice recognition on iOS - convert OOV words to phonemes on iOS?

As suggested on StackOverflow, I've tried OpenEars successfully and generated custom vocabularies from arrays of NSStrings. However, we also need to recognize names from the address book, and here the fallback method fails very often…

I could write a parser and dynamically transcribe the texts (mainly French- and Dutch-sounding names) to phonemes myself, but that would be a lot of guesswork. I'm pretty sure the data I need is generated somewhere in the recognition process, so maybe someone could point me to a hook in the OpenEars or Flite code that I can exploit on iOS?

Or to some other library that would convert user speech to a string of phonemes I can feed into OpenEars?

The right way to recognize names in OpenEars is to put specific pronunciations in the phonetic dictionary. You do not need to analyze phonetic strings yourself; in fact, the recognizer does not have information about the phonetic string at all, so you cannot retrieve it. Also, there is no clear correspondence between the audio and the phoneme sequence.

For example, the grapheme-to-phoneme code can infer the following pronunciation:

tena    T IH N

While the correct pronunciation is:

tena    T EH N AH

With the incorrectly predicted pronunciation, the recognizer will not be able to recognize the name. With the corrected one, it will recognize the name accurately.

The problem is that the automatic word-to-phoneme conversion in OpenEars might fail, and for foreign words it might fail even more frequently. What you need to do is add the names to the dictionary so that the recognizer knows their proper phonetic sequences. If the proper sequence is known, the recognizer will be able to detect the word by itself. You can also improve the grapheme-to-phoneme code in OpenEars to make it more accurate. Modern pocketsphinx uses the Phonetisaurus API, which is more accurate than Flite and also trainable on special cases like foreign names.
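As a rough illustration of adding corrected names to the dictionary, here is a minimal Python sketch that could run as an offline preparation step. It assumes the dictionary is plain text with one `word  PH1 PH2 ...` entry per line, as in the `tena` example, and that alternate pronunciations are marked `word(2)`, `word(3)` in the CMU dictionary style. The function names and the `hello` entry are hypothetical and are not part of OpenEars or pocketsphinx.

```python
import re

# Assumption: alternate pronunciations carry a "(n)" suffix, e.g. "tena(2)".
_ALT_SUFFIX = re.compile(r"\(\d+\)$")


def parse_dictionary(text):
    """Parse 'word  PH1 PH2 ...' lines into {word: [list of phoneme lists]}."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = _ALT_SUFFIX.sub("", parts[0])
        entries.setdefault(word, []).append(parts[1:])
    return entries


def apply_overrides(entries, overrides):
    """Replace generated pronunciations with hand-corrected ones.

    overrides maps a word to a list of pronunciation strings.
    """
    merged = dict(entries)
    for word, pronunciations in overrides.items():
        merged[word] = [p.split() for p in pronunciations]
    return merged


def format_dictionary(entries):
    """Serialize back to dictionary text, one pronunciation per line."""
    lines = []
    for word in sorted(entries):
        for index, phonemes in enumerate(entries[word]):
            label = word if index == 0 else f"{word}({index + 1})"
            lines.append(f"{label}\t{' '.join(phonemes)}")
    return "\n".join(lines) + "\n"


# Hypothetical generated dictionary with the wrong guess for "tena".
generated = "tena\tT IH N\nhello\tHH AH L OW\n"
corrected = apply_overrides(parse_dictionary(generated), {"tena": ["T EH N AH"]})
print(format_dictionary(corrected), end="")
```

Keeping the corrections in a separate overrides table means they survive whenever the vocabulary is regenerated from the address book.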

For any accuracy issue, the first recommendation is to collect a database of test samples so that you can analyze accuracy. Once you have such a database, you can improve accuracy significantly. For details, see:

http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor
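One common way to score such a test database is word error rate: the word-level edit distance between what was said and what was recognized, divided by the number of reference words. The sketch below is only an illustration under that assumption. The sample transcripts are made up, and the function names are hypothetical rather than part of any CMUSphinx tool.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two word lists."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                 # deletion
                current[j - 1] + 1,              # insertion
                previous[j - 1] + substitution,  # match or substitution
            ))
        previous = current
    return previous[-1]


def word_error_rate(samples):
    """samples is a list of (reference transcript, recognizer output) pairs."""
    errors = 0
    total_words = 0
    for reference, hypothesis in samples:
        reference_words = reference.split()
        errors += edit_distance(reference_words, hypothesis.split())
        total_words += len(reference_words)
    if total_words == 0:
        raise ValueError("no reference words to score against")
    return errors / total_words


# Made-up test set: one name misrecognized out of four reference words.
samples = [
    ("call tena", "call tina"),
    ("call marc", "call marc"),
]
print(word_error_rate(samples))
```

Rerunning the same test set after each dictionary change shows whether a corrected pronunciation actually helped.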
