Voice recognition on iOS - convert OOV words to phonemes on iOS?

As suggested on StackOverflow, I've tried OpenEars successfully, generating custom vocabularies from arrays of NSString objects. However, we also need to recognize names from the address book, and here the fallback method fails very often.

I could write a parser and dynamically transcribe the texts (mainly French- and Dutch-sounding names) to phonemes myself, but that would be a lot of (guess)work. I'm pretty sure the data I need is generated somewhere in the recognition process, so maybe someone could point me to a hook in the OpenEars or Flite code that I can exploit on iOS?

Or to some other library that would convert user speech to a string of phonemes I can feed into OpenEars?

The right way to recognize names in OpenEars is to put specific pronunciations in the phonetic dictionary. You do not need to analyze phonetic strings yourself. In fact, the recognizer has no information about the phonetic string at all, so you cannot retrieve it. Also, there is no clear correspondence between audio and a phoneme sequence.

For example, grapheme-to-phoneme code can infer the following pronunciation:

tena    T IH N

while the correct pronunciation is:

tena    T EH N AH

With the incorrect predicted pronunciation, the recognizer will not be able to recognize the name. With the corrected one, it will recognize the name accurately.

The problem is that automatic word-to-phoneme conversion in OpenEars might fail. For foreign words it might fail even more frequently. What you need to do is add the names to the dictionary so that the recognizer knows their proper phonetic sequences. If the proper sequence is known, the recognizer will be able to detect the word by itself. You can also improve the grapheme-to-phoneme code in OpenEars to make it more accurate. Modern PocketSphinx uses the Phonetisaurus API, which is both more accurate than Flite and trainable on special cases like foreign names.
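As a rough illustration of "adding names to the dictionary", here is a minimal Python sketch that merges hand-corrected pronunciations over automatically generated ones and writes them back out in the `word PHONEMES` layout shown above. The helper names (`parse_dictionary`, `merge_overrides`, `format_dictionary`) and the `anna` entry are made up for this example. How the resulting file is handed to OpenEars is not covered here and depends on the version you use.

```python
def parse_dictionary(text):
    """Parse dictionary lines of the form "word PH1 PH2 ..." into a dict."""
    entries = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        entries[parts[0].lower()] = parts[1:]
    return entries


def merge_overrides(generated, overrides):
    """Return a new dict in which hand-checked entries replace generated ones."""
    merged = dict(generated)
    merged.update(overrides)
    return merged


def format_dictionary(entries):
    """Serialize entries as one "word<TAB>phonemes" line per word, sorted by word."""
    return "\n".join(
        f"{word}\t{' '.join(phones)}" for word, phones in sorted(entries.items())
    )


# Hypothetical output of an automatic grapheme-to-phoneme step.
# "tena" carries the wrong guess from the example above; "anna" is illustrative.
generated = parse_dictionary("tena    T IH N\nanna    AE N AH")

# Hand-corrected pronunciations for address book names.
overrides = parse_dictionary("tena    T EH N AH")

merged_text = format_dictionary(merge_overrides(generated, overrides))
print(merged_text)
```

Keeping the corrections in a separate overrides file means the automatic step can be re-run whenever the address book changes without losing the manual fixes.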

For any accuracy issue, the first recommendation is to collect a database of test samples so that you can analyze accuracy properly. Once you have such a database, you can improve accuracy significantly. For details, see:

http://cmusphinx.sourceforge.net/wiki/faq#qwhy_my_accuracy_is_poor
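Once a test database exists, one common way to score it is word error rate (WER): the word-level edit distance between what was said and what was recognized, divided by the number of reference words. The sketch below is only an illustration of that metric in Python. It is not taken from the FAQ, and the sample transcripts are invented. In practice the hypotheses would come from running the recognizer over your recorded test samples.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token lists (substitute/insert/delete = 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1]


def word_error_rate(pairs):
    """WER over (reference, hypothesis) transcript pairs, case-insensitive."""
    errors = 0
    words = 0
    for ref, hyp in pairs:
        ref_tokens = ref.lower().split()
        hyp_tokens = hyp.lower().split()
        errors += edit_distance(ref_tokens, hyp_tokens)
        words += len(ref_tokens)
    return errors / words if words else 0.0


# Invented (reference, recognizer output) pairs, for illustration only.
samples = [
    ("call tena", "call tina"),
    ("call anna", "call anna"),
]
wer = word_error_rate(samples)
print(f"WER: {wer:.2%}")
```

Re-running the same samples after each dictionary change shows whether a corrected pronunciation actually helped.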
