Recognition of a sound (a word) with machine learning in Python
I'm preparing an experiment, and I want to write a program in Python to recognize certain words spoken by the participants.
I have searched a lot about speech recognition in Python, but the results are complicated (e.g. CMUSphinx).
What I want to achieve is a program that receives a sound file (containing only one word, which is not in English), where I tell the program what the sound means and what output I want to see.
I have seen the sklearn example about recognizing hand-written digits. I want to know whether I can do something similar with sound.
Can I do this with Python and sklearn? If so, where should I start?
Thank you!
I've written a program like this for text recognition. I can tell you that if you choose to "teach" your program manually, you will have a lot of work ahead of you: think about the variation in voices due to accents, etc.
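To make the "teaching" idea concrete, here is a minimal sketch of the sklearn route the question asks about. It is only an illustration under stated assumptions: the "recordings" are synthetic noisy tones standing in for spoken words, and the names `fake_recording`, `features`, `word_a` and `word_b` are made up for this example. Real recordings would be loaded from files, and each word would need many samples from different speakers to cover the variation mentioned above.

```python
import numpy as np
from sklearn.svm import SVC

RATE = 8000  # samples per second (assumed for this sketch)

def fake_recording(base_hz, rng, seconds=0.5):
    """Hypothetical stand-in for one spoken word: a noisy tone whose
    pitch varies by up to 10% to mimic differences between speakers."""
    t = np.arange(int(RATE * seconds)) / RATE
    hz = base_hz * rng.uniform(0.9, 1.1)
    return np.sin(2 * np.pi * hz * t) + 0.3 * rng.standard_normal(t.size)

def features(signal, n_bands=32):
    """Turn a recording into a fixed-length vector: log energy per frequency band."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(power, n_bands)
    return np.log1p(np.array([band.sum() for band in bands]))

rng = np.random.default_rng(0)
words = {"word_a": 300.0, "word_b": 900.0}  # made-up labels and pitches

# Build a labelled data set: 40 varied "recordings" per word.
X, y = [], []
for label, hz in words.items():
    for _ in range(40):
        X.append(features(fake_recording(hz, rng)))
        y.append(label)
X, y = np.array(X), np.array(y)

# Shuffle, train on 60 samples, and check accuracy on the remaining 20.
order = rng.permutation(len(y))
train, test = order[:60], order[60:]
clf = SVC(kernel="linear")
clf.fit(X[train], y[train])
accuracy = clf.score(X[test], y[test])
print("held-out accuracy:", accuracy)
```

The structure mirrors the digits example: every sample becomes a row of numbers, every row has a label, and the classifier is fitted on one part and scored on the rest. The band-energy features are a deliberately crude choice for the sketch and would likely need replacing for real speech.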
You could start by looking for a sound analyzer here (Musical Analysis). Try to identify the waveform of a simple word like "yes" and write an algorithm that expresses the variation between sound files as a percentage. That way you can build in a margin to protect yourself against false positives and false negatives.
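One possible reading of the percentage-and-margin idea is sketched below. It compares the band-energy profile of a reference sound with a candidate using cosine similarity, scales the result to a percentage, and accepts the candidate only above a threshold. The function names, the 80% threshold, and the synthetic tones used as "words" are all assumptions made for illustration, not part of any library.

```python
import numpy as np

def spectrum_profile(signal, n_bands=32):
    """Energy per frequency band, normalised to unit length."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array([band.sum() for band in np.array_split(power, n_bands)])
    return bands / np.linalg.norm(bands)

def similarity_percent(reference, candidate):
    """Cosine similarity of the two band profiles, as a percentage (0 to 100)."""
    score = np.dot(spectrum_profile(reference), spectrum_profile(candidate))
    return 100.0 * float(score)

def matches(reference, candidate, threshold=80.0):
    """Accept the candidate only if it is similar enough; the threshold is the margin."""
    return similarity_percent(reference, candidate) >= threshold

# Synthetic stand-ins for recordings (assumed 8 kHz, half a second each).
RATE = 8000
t = np.arange(RATE // 2) / RATE
rng = np.random.default_rng(1)
reference = np.sin(2 * np.pi * 440 * t)                                    # the "yes" template
same_word = np.sin(2 * np.pi * 450 * t) + 0.3 * rng.standard_normal(t.size)  # slightly different voice, plus noise
other_word = np.sin(2 * np.pi * 1200 * t)                                  # a different sound

print(matches(reference, same_word), matches(reference, other_word))
```

Raising the threshold reduces false positives at the cost of more false negatives, and lowering it does the opposite. Note that this sketch ignores timing entirely; real words differ in how their spectrum changes over time, so a practical version would probably compare short frames rather than the whole file at once.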
Also, you might need to remove background noise from the sound file first, as it may interfere with your identification patterns.
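As a rough illustration of the noise-removal step, the sketch below uses basic spectral subtraction: it estimates the average noise spectrum from a noise-only sample and subtracts it from each frame of the recording. This is one technique among several and is swapped in here as an assumption, since the answer does not name a method. The function name `spectral_subtract`, the frame size, and the synthetic signals are likewise made up for the example.

```python
import numpy as np

def spectral_subtract(signal, noise_sample, frame=256):
    """Subtract the average noise magnitude spectrum from each frame.

    Uses non-overlapping frames and drops any trailing partial frame,
    so the output can be slightly shorter than the input.
    """
    def to_frames(x):
        n = len(x) // frame
        return x[: n * frame].reshape(n, frame)

    # Average magnitude spectrum of the noise-only sample.
    noise_mag = np.abs(np.fft.rfft(to_frames(noise_sample), axis=1)).mean(axis=0)

    spec = np.fft.rfft(to_frames(signal), axis=1)
    # Reduce each magnitude by the noise estimate, never going below zero.
    mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
    # Rebuild the frames using the original phase.
    cleaned = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=frame, axis=1)
    return cleaned.ravel()

# Synthetic demonstration (assumed 8 kHz): a tone buried in white noise.
RATE = 8000
t = np.arange(RATE // 2) / RATE
rng = np.random.default_rng(2)
clean = np.sin(2 * np.pi * 500 * t)
noisy = clean + 0.3 * rng.standard_normal(t.size)
noise_only = 0.3 * rng.standard_normal(t.size)  # e.g. recorded before the participant speaks

cleaned = spectral_subtract(noisy, noise_only)
n = len(cleaned)
error_before = np.mean((noisy[:n] - clean[:n]) ** 2)
error_after = np.mean((cleaned - clean[:n]) ** 2)
print(error_before, error_after)
```

In an experiment, the noise-only sample could come from a short stretch of silence recorded in the same room. Spectral subtraction of this simple kind tends to leave audible artifacts, so it is better treated as a starting point than as a finished filter.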