
How to correctly train my Neural Network

I'm trying to teach a neural network to decide where to go based on the life level it receives as input. The network always receives three inputs, [x, y, life]. If life >= 0.2, it should output the angle from [x, y] to (1, 1). If life < 0.2, it should output the angle from [x, y] to (0, 0).

As the inputs and outputs of neurons should be between 0 and 1, I divide the angle by 2 * Math.PI.
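The rule described above can be written down as a plain function, which is handy for checking expected values by hand. This is only a sketch: angleToPoint is copied from the code further down the page, while targetOutput is a hypothetical helper name that does not appear in the original code.

```javascript
// Direction from (x1, y1) toward (x2, y2), in radians within [0, 2*PI).
function angleToPoint(x1, y1, x2, y2) {
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if (angle < 0) {
    angle += 2 * Math.PI;
  }
  return angle;
}

// Hypothetical helper (not in the original code): the value the network
// is supposed to output for one input [x, y, life].
function targetOutput(x, y, life) {
  var angle = life >= 0.2
    ? angleToPoint(x, y, 1, 1)   // enough life: head toward (1, 1)
    : angleToPoint(x, y, 0, 0);  // low life: head toward (0, 0)
  return angle / (2 * Math.PI);  // scale into [0, 1)
}

console.log(targetOutput(0, 1, 0.19)); // approximately 0.75
console.log(targetOutput(1, 0, 0.5));  // approximately 0.25
```

Note the jump in the target value as life crosses 0.2: the same [x, y] maps to two very different outputs on either side of the threshold.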

Here is the code:

var network = new synaptic.Architect.Perceptron(3,4,1);

for(var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  for(var j = 0; j < 100; j++){
    network.activate([x,y,j/100]);
    if(j < 20){
      network.propagate(0.3, [angle1]);
    } else {
      network.propagate(0.3, [angle2]);
    }
  }
}

Try it out here: jsfiddle

So when I enter the input [0, 1, 0.19], I expect the neural network to output something close to [0.75] (1.5π / 2π). But my results are completely inconsistent and show no correlation with the input at all.

What mistake am I making in teaching my neural network?

I have managed to teach a neural network to output 1 for an input [a, b, c] with c >= 0.2 and 0 for an input [a, b, c] with c < 0.2. I have also managed to teach it to output the angle to a certain location based on an [x, y] input. However, I can't seem to combine the two.


As requested, I have written some code that uses two neural networks to get the desired output. The first network converts the life level to a 0 or a 1, and the second network outputs an angle depending on the 0 or 1 produced by the first. This is the code:

// This network outputs 1 when life >= 0.2, otherwise 0
var network1 = new synaptic.Architect.Perceptron(3,3,1);
// This network outputs the angle to a certain point based on life
var network2 = new synaptic.Architect.Perceptron(3,3,1);

for (var i = 0; i < 50000; i++){
  var x = Math.random();
  var y = Math.random();
  var angle1 = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  var angle2 = angleToPoint(x, y, 1, 1) / (2 * Math.PI);

  for(var j = 0; j < 100; j++){
    network1.activate([x,y,j/100]);
    if(j < 20){
      network1.propagate(0.1, [0]);
    } else {
      network1.propagate(0.1, [1]);
    }
    network2.activate([x,y,0]);
    network2.propagate(0.1, [angle1]);
    network2.activate([x,y,1]);
    network2.propagate(0.1, [angle2]);
  }
}

Try it out here: jsfiddle

As you can see in this example, it gets quite close to the desired output, and adding more iterations brings it closer still.

Observations

  1. Skewed distribution sampled as training set

    Your training set picks the life parameter inside for(var j = 0; j < 100; j++), which is heavily biased towards j >= 20 and consequently life >= 0.2. That subset gets 4 times as much training data as the other one, so the training effectively prioritizes it.
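The 4:1 imbalance can be checked by counting the inner-loop steps on each side of the threshold. The balancedLife function afterwards is a hypothetical alternative, shown only to illustrate one way of drawing both branches equally often; it is not part of the original code.

```javascript
// Count how many of the 100 inner-loop steps fall on each side of the
// life threshold, exactly as the question's loop generates them.
var counts = { low: 0, high: 0 };
for (var j = 0; j < 100; j++) {
  if (j < 20) {
    counts.low++;   // life = j/100 < 0.2, target is the angle to (0, 0)
  } else {
    counts.high++;  // life >= 0.2, target is the angle to (1, 1)
  }
}
console.log(counts); // { low: 20, high: 80 }

// Hypothetical alternative: draw life so both branches are equally likely.
function balancedLife() {
  return Math.random() < 0.5
    ? Math.random() * 0.2          // uniform in [0, 0.2)
    : 0.2 + Math.random() * 0.8;   // uniform in [0.2, 1)
}
```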

  2. Non-shuffled training data

    You are training sequentially over the life parameter, which can be harmful. Your network will end up giving more attention to the larger values of j, since they are the most recent cause of network propagations. You should shuffle your training set to avoid this bias.

    This stacks with the previous point, because you are again giving more attention to some subset of life values.
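One way to remove the ordering bias is to build all samples first and then visit them in random order. The sketch below uses a Fisher-Yates shuffle; the names shuffled, samples and order are illustrative only. The synaptic Trainer used in the code further down is given a shuffle: true option for the same purpose.

```javascript
// Fisher-Yates shuffle: returns a shuffled copy, leaving the input untouched.
function shuffled(items) {
  var copy = items.slice();
  for (var i = copy.length - 1; i > 0; i--) {
    var k = Math.floor(Math.random() * (i + 1)); // 0 <= k <= i
    var tmp = copy[i];
    copy[i] = copy[k];
    copy[k] = tmp;
  }
  return copy;
}

// Build the samples first, then visit them in random order instead of
// always sweeping life from 0 up to 0.99.
var samples = [];
for (var j = 0; j < 100; j++) {
  samples.push({ life: j / 100 });
}
var order = shuffled(samples);
```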

  3. You should measure your training performance as well

    Your network, despite the previous observations, was not really that bad. Your training error was not as large as your test error. This discrepancy usually means that you are training and testing on different sample distributions.

    You could say that you have two classes of data points: the ones with life>0.2 and the others. But because you introduced a discontinuity in the angleToPoint function, I'd recommend that you separate them into three classes: keep one class for life<0.2 (because the function behaves continuously there) and split life>0.2 into "above (1,1)" and "below (1,1)".
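Comparing training and test error only needs a function that scores predictions against a sample set. The following is a minimal sketch of a mean squared error over {input, output} samples; meanSquaredError, constantPredictor and demoSet are hypothetical names, and the constant predictor merely stands in for a trained network.

```javascript
// Mean squared error of a prediction function over a set of
// {input, output} samples (same shape as the trainingSet entries).
function meanSquaredError(predict, samples) {
  var total = 0;
  samples.forEach(function (sample) {
    var predicted = predict(sample.input);
    var sum = 0;
    for (var i = 0; i < sample.output.length; i++) {
      var diff = sample.output[i] - predicted[i];
      sum += diff * diff;
    }
    total += sum / sample.output.length;
  });
  return total / samples.length;
}

// Stand-in predictor for illustration only; with synaptic one would pass
// function (input) { return network.activate(input); } instead.
function constantPredictor() {
  return [0.5];
}

var demoSet = [
  { input: [0, 1, 0.1], output: [0.75] },
  { input: [1, 0, 0.5], output: [0.25] }
];
console.log(meanSquaredError(constantPredictor, demoSet)); // 0.0625
```

Running the same function over the training set and over a held-out test set makes a gap between the two visible.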

  4. Network complexity

    You could successfully train a network for each task separately. Now you want to stack them. This is precisely the purpose of deep learning: each layer builds on the concepts perceived by the previous layer, which increases the complexity of the concepts the network can learn.

    So instead of using 20 nodes in a single layer, I'd recommend that you use 2 layers of 10 nodes. This matches the class hierarchy I mentioned in the previous point.
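For a rough sense of how the two layouts compare in size, the sketch below counts weights and biases, assuming plain fully connected layers with one bias per non-input neuron. That assumption may not match synaptic's internals exactly, and countParameters is a hypothetical helper.

```javascript
// Number of trainable parameters in a fully connected feed-forward net,
// given layer sizes such as [3, 20, 1]: one weight per connection between
// consecutive layers plus one bias per non-input neuron.
function countParameters(layers) {
  var total = 0;
  for (var i = 1; i < layers.length; i++) {
    total += layers[i - 1] * layers[i]; // weights into layer i
    total += layers[i];                 // biases of layer i
  }
  return total;
}

console.log(countParameters([3, 20, 1]));     // 101
console.log(countParameters([3, 10, 10, 1])); // 161
```

Both layouts use the same 20 hidden neurons; the two-layer version adds the connections between the hidden layers, which is where the second level of features is formed.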

The Code

Running this code, I got a training/testing error of 0.0004 / 0.0002.

https://jsfiddle.net/hekqj5jq/11/

var network = new synaptic.Architect.Perceptron(3,10,10,1);
var trainer = new synaptic.Trainer(network);
var trainingSet = [];

for(var i = 0; i < 50000; i++){
  // 1st category: above vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(x, 1.0);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 2nd category: below vector (1,1), measure against (1,1)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, x);
  var z = getRandom(0.2, 1);
  var angle = angleToPoint(x, y, 1, 1) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
  // 3rd category: above/below vector (1,1), measure against (0,0)
  var x = getRandom(0.0, 1.0);
  var y = getRandom(0.0, 1.0);
  var z = getRandom(0.0, 0.2);
  var angle = angleToPoint(x, y, 0, 0) / (2 * Math.PI);
  trainingSet.push({input: [x,y,z], output: [angle]});
}

trainer.train(trainingSet, {
    rate: 0.1,
    error: 0.0001,
    iterations: 50,
    shuffle: true,
    log: 1,
    cost: synaptic.Trainer.cost.MSE
});

var testSet = [
    {input: [0,1,0.25], output: [angleToPoint(0, 1, 1, 1) / (2 * Math.PI)]},
    {input: [1,0,0.35], output: [angleToPoint(1, 0, 1, 1) / (2 * Math.PI)]},
    {input: [0,1,0.10], output: [angleToPoint(0, 1, 0, 0) / (2 * Math.PI)]},
    {input: [1,0,0.15], output: [angleToPoint(1, 0, 0, 0) / (2 * Math.PI)]}
];

$('html').append('<p>Train:</p> ' + JSON.stringify(trainer.test(trainingSet)));
$('html').append('<p>Tests:</p> ' + JSON.stringify(trainer.test(testSet)));

$('html').append('<p>1st:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.25]));

$('html').append('<p>2nd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 1, 1) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.25]));

$('html').append('<p>3rd:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(0, 1, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([0, 1, 0.15]));

$('html').append('<p>4th:</p> ')
$('html').append('<p>Expect:</p> ' + angleToPoint(1, 0, 0, 0) / (2 * Math.PI));
$('html').append('<p>Received: </p> ' + network.activate([1, 0, 0.15]));

function angleToPoint(x1, y1, x2, y2){
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if(angle < 0){
    angle += 2 * Math.PI;
  }
  return angle;
}

function getRandom (min, max) {
    return Math.random() * (max - min) + min;
}

Further Remarks

As I mentioned in the comments and in the chat, there is no such thing as the "angle between (x,y) and (0,0)", because the angle between two vectors is usually taken as the difference between their directions, and (0,0) has no direction.

Your function angleToPoint(p1, p2) instead returns the direction of the difference vector (p2 - p1), measured from the x axis. For p2 = (0,0), that is the direction of -p1, which is the angle between p1 and the x axis plus 180 degrees. For p1 = (1,1) and p2 = (1,0), it will not return 45 degrees. For p1 = p2, the direction is undefined rather than zero.
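To make the distinction concrete, the sketch below compares angleToPoint (copied from the code above) with an angle-between-vectors computation based on the dot product. angleBetween and toDegrees are hypothetical helpers added for illustration.

```javascript
// Copied from the code above: direction from (x1, y1) toward (x2, y2).
function angleToPoint(x1, y1, x2, y2) {
  var angle = Math.atan2(y2 - y1, x2 - x1);
  if (angle < 0) {
    angle += 2 * Math.PI;
  }
  return angle;
}

// Hypothetical helper: angle between vectors a and b via the dot product.
// Returns NaN when either vector is (0, 0), since it has no direction.
function angleBetween(ax, ay, bx, by) {
  var dot = ax * bx + ay * by;
  var norms = Math.sqrt(ax * ax + ay * ay) * Math.sqrt(bx * bx + by * by);
  var cosine = Math.max(-1, Math.min(1, dot / norms)); // guard rounding
  return Math.acos(cosine);
}

function toDegrees(radians) {
  return radians * 180 / Math.PI;
}

console.log(toDegrees(angleToPoint(1, 1, 1, 0))); // about 270
console.log(toDegrees(angleBetween(1, 1, 1, 0))); // about 45
console.log(angleToPoint(1, 1, 1, 1));            // 0, because Math.atan2(0, 0) is 0
console.log(angleBetween(1, 1, 0, 0));            // NaN
```

So JavaScript quietly reports 0 for coincident points even though the direction is mathematically undefined, which is worth keeping in mind when generating training targets.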
