How to properly set up brain.js Neural Network

I am using the Auto MPG training set from http://archive.ics.uci.edu/ml/datasets/Auto+MPG

My code is:

'use strict';
var brain, fs, normalizeData, trainNetwork, _;

_ = require('lodash');

brain = require('brain');

fs = require('fs');

trainNetwork = function(trainNetworkCb) {
  var net;
  net = new brain.NeuralNetwork();
  return fs.readFile('./data/autodata.csv', function(err, fileData) {
    var fileString, lines, trainingData;
    if (err) {
      return trainNetworkCb(err);
    }
    fileString = fileData.toString();
    lines = fileString.split('\n');
    trainingData = lines.splice(0, lines.length / 2);
    trainingData = _.map(trainingData, function(dataPoint) {
      var normalizedData, obj;
      normalizedData = normalizeData(dataPoint);
      obj = {
        input: normalizedData,
        output: {
          continuous: normalizedData.continuous
        }
      };
      delete obj.input.continuous;
      return obj;
    });
    net.train(trainingData, {
      log: true,
      logPeriod: 100,
      errorThresh: 0.00005
    });
    return trainNetworkCb(null, net);
  });
};

trainNetwork(function(err, net) {
  if (err) {
    throw err;
  }
  return fs.readFile('./data/autodata.csv', function(err, fileData) {
    var fileString, lines, testData;
    if (err) {
      throw err;
    }
    fileString = fileData.toString();
    lines = fileString.split('\n');
    testData = lines.splice(lines.length / 2);
    testData = _.filter(testData, function(point) {
      return point !== '';
    });
    testData = _.map(testData, function(dataPoint) {
      var normalizedData, obj;
      normalizedData = normalizeData(dataPoint);
      obj = {
        output: {
          continuous: normalizedData.continuous
        },
        input: normalizedData
      };
      delete obj.input.continuous;
      return obj;
    });
    return _.each(testData, function(dataPoint) {
      var output;
      output = net.run(dataPoint.input);
      console.log(output);
      console.log(dataPoint);
      return console.log('');
    });
  });
});

normalizeData = function(dataRow) {
  var cylinders, dataSet, model_years, origins, row;
  dataSet = dataRow.split(',');
  dataSet = _.map(dataSet, function(point) {
    return Number(point);
  });
  row = {};
  cylinders = [5, 3, 6, 4, 8];
  _.each(cylinders, function(cylinder) {
    row["cylinder" + cylinder] = cylinder === dataSet[0] ? 1 : 0;
  });
  row.displacement = dataSet[1] / 500;
  row.horsepower = dataSet[2] / 500;
  row.weight = dataSet[3] / 10000;
  row.acceleration = dataSet[4] / 100;
  model_years = [82, 81, 80, 79, 78, 77, 76, 75, 74, 73, 72, 71, 70];
  _.each(model_years, function(model_year) {
    row["model_year" + model_year] = model_year === dataSet[5] ? 1 : 0;
  });
  origins = [2, 3, 1];
  _.each(origins, function(origin) {
    row["origin" + origin] = origin === dataSet[6] ? 1 : 0;
  });
  row.continuous = dataSet[7] / 100;
  return row;
};

I believe I am normalizing everything correctly. I am using half the data for training and the other half for testing. As far as I can tell the data is not ordered, so it shouldn't matter which half is used for which.

However, my errors when testing are pretty large, usually off by 10 MPG or so (about 30% error). What am I doing incorrectly?

Thanks
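One way to put a single number on the test error, rather than reading it off the logged rows, is a mean absolute error converted back to MPG. This is only a sketch: `meanAbsoluteErrorMpg` is a hypothetical helper, and it assumes both arrays hold values scaled the same way as `row.continuous` in `normalizeData` (mpg divided by 100).

```javascript
// Mean absolute error in MPG.
// Assumption: predicted and actual hold normalized values (mpg / 100),
// matching the scaling of row.continuous in normalizeData.
function meanAbsoluteErrorMpg(predicted, actual) {
  if (predicted.length !== actual.length || predicted.length === 0) {
    throw new Error('predicted and actual must be non-empty and the same length');
  }
  var total = 0;
  for (var i = 0; i < predicted.length; i++) {
    // Undo the / 100 scaling so the error is expressed in MPG.
    total += Math.abs(predicted[i] - actual[i]) * 100;
  }
  return total / predicted.length;
}
```

In the test loop, the two arrays would presumably be filled from the `continuous` field of each network output and from `dataPoint.output.continuous`.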

The dataset you linked is ordered by model year; perhaps drastic changes in technology made the engines more efficient? Neural networks depend on correct outputs during training. I would try training the network with all but the last row, and then testing with that row. Can you link me the CSV file you're using? The normalizeData function doesn't give what you want with the linked file (http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data).

Edit:

It seems that, regardless of the errorThresh you specify, brain won't run more than 20,000 iterations per training run. There are several ways to get around this. You can specify the learningRate of your neural network. Raising the learningRate to 0.6 (the default is 0.3) helped me get more accurate results:
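If the rows really are ordered by model year, taking the first half for training and the second half for testing means the network never sees the later years. A minimal sketch of shuffling before splitting follows; `shuffleAndSplit` is a hypothetical helper (not part of brain or lodash), and it assumes `lines` is the array produced by `fileString.split('\n')`.

```javascript
// Shuffle the rows with a Fisher-Yates shuffle, then split them into
// training and test sets. `rand` is optional and defaults to Math.random.
function shuffleAndSplit(lines, trainFraction, rand) {
  var random = rand || Math.random;
  // filter() returns a new array, so the caller's array is left untouched;
  // it also drops the empty line left by a trailing newline.
  var rows = lines.filter(function (line) {
    return line.trim() !== '';
  });
  for (var i = rows.length - 1; i > 0; i--) {
    var j = Math.floor(random() * (i + 1));
    var tmp = rows[i];
    rows[i] = rows[j];
    rows[j] = tmp;
  }
  var cut = Math.floor(rows.length * trainFraction);
  return { train: rows.slice(0, cut), test: rows.slice(cut) };
}
```

Calling it once, for example `shuffleAndSplit(lines, 0.5)`, and using the same result for both training and testing avoids reading the file twice and keeps the two sets disjoint.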

net.train(trainingData, {
  log: true,
  logPeriod: 100,
  errorThresh: 0.00005,
  learningRate: 0.6
});

A higher learningRate means more aggressive weight adjustment, which helps when you aren't able to run as many iterations as you would like.

Alternatively, you can specify the total number of iterations in the options object (if not specified, it defaults to 20,000):
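The effect can be seen on a toy problem that has nothing to do with brain itself. The sketch below runs plain gradient descent on the one-dimensional loss f(w) = 0.5 * (w - 3)^2 and counts the steps needed to get close to the minimum; `stepsToConverge` and the loss are illustrative assumptions, not brain's internals.

```javascript
// Count gradient-descent steps needed to bring w within `tolerance` of the
// minimum of the toy loss f(w) = 0.5 * (w - 3)^2, whose gradient is (w - 3).
// Stops early at maxSteps if the tolerance is never reached.
function stepsToConverge(learningRate, tolerance, maxSteps) {
  var w = 0;
  var steps = 0;
  while (Math.abs(w - 3) > tolerance && steps < maxSteps) {
    w -= learningRate * (w - 3); // one gradient step
    steps++;
  }
  return steps;
}

console.log('learningRate 0.3:', stepsToConverge(0.3, 1e-4, 1000), 'steps');
console.log('learningRate 0.6:', stepsToConverge(0.6, 1e-4, 1000), 'steps');
```

On this toy loss the larger rate needs fewer steps, but a rate that is too large overshoots the minimum and never converges, so it is worth watching the logged training error when raising it.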

net.train(trainingData, {
  log: true,
  logPeriod: 100,
  errorThresh: 0.00005,
  iterations: 100000
});

brain stops training when i < iterations && error > errorThresh evaluates to false. So feel free to crank up the iterations count to ensure that the expression turns false because the error has dropped below your specified errorThresh, not because the iteration cap was hit.
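To see how the two conditions interact, here is a small simulation of that loop. It is a sketch only: `fakeTrainingPass` is a made-up stand-in for one pass over the training data, with an arbitrary error curve, and `trainUntil` is not brain's actual code.

```javascript
// Hypothetical stand-in for one training pass: returns an error that
// decays exponentially with the iteration number. The curve is arbitrary.
function fakeTrainingPass(i) {
  return Math.exp(-i / 3000);
}

// Loop with the same stopping condition as described above. The 20,000
// default for iterations mirrors the default mentioned in this answer.
function trainUntil(options) {
  var iterations = options.iterations || 20000;
  var errorThresh = options.errorThresh;
  var error = 1;
  var i;
  for (i = 0; i < iterations && error > errorThresh; i++) {
    error = fakeTrainingPass(i);
  }
  return { iterations: i, error: error };
}

// Stops at the iteration cap with the error still above the threshold.
console.log(trainUntil({ errorThresh: 0.00005 }));
// Stops before the cap because the error fell below the threshold.
console.log(trainUntil({ errorThresh: 0.00005, iterations: 100000 }));
```

If the returned iteration count equals the cap, training ended because it ran out of iterations, which is the situation the options above are meant to address.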
