
Speech Recognition using LPC and ANN in Matlab

I have audio recordings of 4 phonemes (a, e, o, u) from 11 people. I trained an ANN using the data from 10 people and used the remaining person's data for testing. I used 14 LPC coefficients of the first period (20ms) of records as features.

The training matrix I has 14 rows and 10 columns for each phoneme, so it is 14*40. Since this is a supervised classification problem, I constructed a target matrix T, which is 4*40. It contains ones and zeros, where a 1 indicates that the corresponding column in I belongs to that class.
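For reference, a target matrix of this shape can be sketched as follows. This is only an illustration and assumes the columns of I are grouped by phoneme (10 columns of a, then e, o, u); the variable names nClasses and nSpeakers are hypothetical and not from the original code.

```matlab
nClasses  = 4;    % phonemes a, e, o, u
nSpeakers = 10;   % training speakers per phoneme

% Assumption: columns 1-10 of I hold 'a', 11-20 'e', 21-30 'o', 31-40 'u'.
% kron places a 1-by-10 block of ones on each diagonal entry of eye(4),
% giving one row per class and one column per training vector.
T = kron(eye(nClasses), ones(1, nSpeakers));   % 4-by-40 one-hot targets
```

If the columns of I are ordered differently, T has to be permuted in the same way so that each column still carries the 1 in the row of its own class.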

The test data matrix has 14 rows and 4 columns, since it contains the 4 phonemes from only one person. Let us call it S.

Here is the code:

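net = newff(I, T, 15);            % feed-forward net, one hidden layer of 15 neurons
net = init(net);                  % re-initialize weights and biases
net.trainParam.epochs = 10000;    % maximum number of training epochs
net.trainParam.goal = 0.01;       % stop once the training error reaches this value
net = train(net, I, T);
y1 = sim(net, I);                 % outputs on the training data
y2 = sim(net, S)                  % outputs on the held-out speaker (displayed)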

The results are poor even when I use the training data as test data (y1).
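One way to put a number on "poor" is to turn the network outputs into class labels and compare them with the targets. The sketch below uses small made-up stand-ins for y1 and T (8 columns instead of 40) so that it runs on its own; the names predicted, actual, accuracy and confusion are illustrative.

```matlab
% Hypothetical stand-ins: 4 classes, 2 columns per class.
T  = kron(eye(4), ones(1, 2));
y1 = T * 0.8 + 0.1;                    % outputs close to the targets
y1(:, 3) = [0.6; 0.3; 0.05; 0.05];     % force one error (true class is 2)

[~, predicted] = max(y1, [], 1);       % row of the largest output per column
[~, actual]    = max(T,  [], 1);       % row of the 1 in each target column

accuracy  = mean(predicted == actual); % fraction of correctly labelled columns
% Rows: true class, columns: predicted class.
confusion = accumarray([actual(:), predicted(:)], 1, [4 4]);
```

With the real y1 and T from the code above, the same lines report the training accuracy and show which phonemes are confused with each other.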

What is wrong here?

I used 14 LPC coefficients of the first period (20ms) of records as features.

So did you ignore almost all the sound data except the first 20ms? That doesn't sound right. You should at least calculate an average over all frames.
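A minimal sketch of that averaging, under several assumptions: the sampling rate fs, the synthetic test signal, the Hamming window and the non-overlapping framing are all illustrative choices, not taken from the question. The coefficients are obtained with the autocorrelation method by solving the normal equations directly, so the sketch needs no toolbox; the Signal Processing Toolbox function lpc should give comparable values (with a leading 1).

```matlab
% Hypothetical input: a synthetic vowel-like signal stands in for a recording.
fs = 8000;                                  % assumed sampling rate in Hz
t  = (0:fs/2-1)' / fs;                      % 0.5 s of samples (column vector)
x  = sin(2*pi*150*t) + 0.5*sin(2*pi*700*t) + 0.05*randn(size(t));

p        = 14;                              % LPC order, as in the question
frameLen = round(0.020 * fs);               % 20 ms frame
nFrames  = floor(numel(x) / frameLen);      % non-overlapping frames
win      = 0.54 - 0.46*cos(2*pi*(0:frameLen-1)'/(frameLen-1));  % Hamming

A = zeros(p, nFrames);                      % one column of coefficients per frame
for k = 1:nFrames
    seg = x((k-1)*frameLen + (1:frameLen)) .* win;
    r = zeros(p+1, 1);                      % autocorrelation at lags 0..p
    for lag = 0:p
        r(lag+1) = sum(seg(1:end-lag) .* seg(1+lag:end));
    end
    % Normal (Yule-Walker) equations: R * a = -r(2:end)
    A(:, k) = toeplitz(r(1:p)) \ (-r(2:p+1));
end

feature = mean(A, 2);                       % 14-by-1 average over all frames
```

The vector feature would then replace the single-frame coefficients as one column of I. Averaging raw LPC coefficients is the simplest option; it is shown here only because it is what the sentence above suggests.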

What is wrong here?

You started coding without understanding the theory. You probably want to read an introduction first. At least this, and ideally this.

To understand why the ANN doesn't work, calculate how many parameters are required to map 14 features to 4 classes, then calculate how many training vectors you have for every parameter. Take into account that, as a rule of thumb, you need at least 10 samples per parameter for an initial estimate. That means your training data is not enough.
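The arithmetic can be sketched with the figures from the question (14 inputs, one hidden layer of 15 neurons, 4 outputs, 40 training vectors). The count assumes a fully connected network with one bias per neuron; the factor of 10 is the rule of thumb mentioned above, not a hard limit.

```matlab
nIn     = 14;    % LPC coefficients per training vector
nHidden = 15;    % hidden neurons, as in newff(I, T, 15)
nOut    = 4;     % phoneme classes
nTrain  = 40;    % columns of the training matrix I

% Weights plus one bias per neuron in each layer.
nParams = (nIn + 1) * nHidden + (nHidden + 1) * nOut;   % 225 + 64 = 289

samplesPerParam = nTrain / nParams;   % about 0.14 training vectors per parameter
nNeeded         = 10 * nParams;       % 2890 vectors under the rule of thumb

fprintf('%d parameters, %.2f samples per parameter, about %d samples needed\n', ...
        nParams, samplesPerParam, nNeeded);
```

So the network has roughly seven times more parameters than training vectors, far from ten vectors per parameter.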


 