
php-ml regression predicts weird values

My input values are 1, 2, 3, 4, ... and my output values are 1*1, 2*2, 3*3, 4*4, ... My code looks like this:

require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\LeastSquares;

$reg = new LeastSquares();

$samples = array();
$targets = array();
for ($i = 1; $i < 100; $i++)
{  
  $samples[] = [$i];
  $targets[] = $i*$i;
}

$reg->train($samples, $targets);
  
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";

I expect it to output roughly 25 and 100. But I get:

-1183.3333333333
-683.33333333333

I also tried to use SVR instead of LeastSquares but the values are strange too:

2498.23
2498.23

I am new to ML. What am I doing wrong?

It looks like the code you provided is almost correct. There are a few issues that you need to fix in order to make it work:

You need a LeastSquares class with train and predict methods, either php-ml's own Phpml\Regression\LeastSquares or one you define yourself. You need to include the necessary dependencies (e.g. libraries or other PHP files). You need to make sure that the $samples and $targets arrays are correctly initialized and filled with the desired values. Here is an example of how you can implement least squares regression yourself in PHP. Note that it still fits a straight line, so on this data it will not return 25 and 100:

<?php

// include necessary dependencies
// (Matrix.php is assumed to be your own helper file providing
// static transpose, dot and inv methods; it is not part of PHP)
require_once 'Matrix.php';

class LeastSquares
{
    // fitted weights: intercept first, then one coefficient per feature
    private $weights;

    // solve least squares problem using normal equations
    public function train($samples, $targets)
    {
        // number of samples
        $m = count($samples);

        // add ones column to samples (intercept term)
        for ($i = 0; $i < $m; $i++) {
            array_unshift($samples[$i], 1);
        }

        // transpose samples
        $samples_t = Matrix::transpose($samples);

        // compute X^T X
        $dot = Matrix::dot($samples_t, $samples);

        // invert X^T X
        $inv = Matrix::inv($dot);

        // compute (X^T X)^-1 X^T
        $dot = Matrix::dot($inv, $samples_t);

        // compute weights
        $this->weights = Matrix::dot($dot, $targets);
    }

    // make a prediction
    public function predict($sample)
    {
        // add the constant 1 for the intercept term
        array_unshift($sample, 1);

        // compute prediction
        return Matrix::dot($this->weights, $sample);
    }
}

// create instance of LeastSquares class
$reg = new LeastSquares();

// create arrays of samples and targets
$samples = array();
$targets = array();
for ($i = 1; $i < 100; $i++) {
    $samples[] = [$i];
    $targets[] = $i * $i;
}

// fit a linear model to the samples using least squares
$reg->train($samples, $targets);

// predict the output for two new samples
echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";

?>

As others have pointed out in the comments, LeastSquares is for fitting a linear model to your data (training examples).

Your data set (target = samples^2) is inherently non-linear. If you picture the best possible line (in the least-squares sense) fitted to a quadratic curve, you get a negative y-intercept:

[Image: sketch of a least-squares line fitted to a quadratic curve]

You've trained your linear model on data up to x=99, y=9801, which means the fitted line has a steep slope and a very large negative y-intercept. Your two outputs imply a line of roughly y = 100x - 1683, so down at x=5 or x=10 you end up with a large negative value, as you've found.
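
To see where those numbers come from without php-ml, here is a minimal sketch that fits a straight line to the same data using the closed-form formulas for one-feature least squares. The function name fitLine is made up for this illustration and is not part of php-ml.

```php
<?php
// Closed-form simple linear regression for a single feature.
// Returns [slope, intercept]. (fitLine is a hypothetical helper name.)
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $dx = $xs[$i] - $meanX;
        $sxy += $dx * ($ys[$i] - $meanY);
        $sxx += $dx * $dx;
    }

    $slope = $sxy / $sxx;
    $intercept = $meanY - $slope * $meanX;
    return [$slope, $intercept];
}

// Same data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(function ($x) { return $x * $x; }, $xs);

[$slope, $intercept] = fitLine($xs, $ys);

echo "slope = $slope, intercept = $intercept\n";
echo $slope * 5 + $intercept, "\n";  // prediction of the line at x = 5
echo $slope * 10 + $intercept, "\n"; // prediction of the line at x = 10
```

The slope comes out at about 100 and the intercept at about -1683.33, which gives the same two values the question reports for x=5 and x=10.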

If you use support vector regression with a degree-2 polynomial it will do a good job of capturing the pattern of your data:

<?php
require_once __DIR__ . '/vendor/autoload.php';
use Phpml\Regression\SVR;
use Phpml\SupportVectorMachine\Kernel;

$samples = array();
$targets = array();
for ($i = 1; $i <= 100; $i++)
{  
  $samples[] = [$i];
  $targets[] = $i*$i;
}

$reg = new SVR(Kernel::POLYNOMIAL, 2); // second argument is the polynomial degree
$reg->train($samples, $targets);

echo $reg->predict([5])."\n";
echo $reg->predict([10])."\n";
?>

Returns:

25.0995
100.098
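
Another option, if you want to stay with ordinary least squares, is to keep the model linear in its weights but feed it x^2 as an extra feature. The sketch below does this in plain PHP so it runs without any library; fitQuadratic, predictQuadratic and solveLinearSystem are names invented for this example. With php-ml the same idea should amount to training LeastSquares on samples of the form [$i, $i*$i], but I have not run that variant here.

```php
<?php
// Solve the square system A w = b by Gaussian elimination with partial pivoting.
function solveLinearSystem(array $a, array $b): array
{
    $n = count($b);
    for ($col = 0; $col < $n; $col++) {
        // pick the row with the largest pivot in this column
        $pivot = $col;
        for ($r = $col + 1; $r < $n; $r++) {
            if (abs($a[$r][$col]) > abs($a[$pivot][$col])) {
                $pivot = $r;
            }
        }
        [$a[$col], $a[$pivot]] = [$a[$pivot], $a[$col]];
        [$b[$col], $b[$pivot]] = [$b[$pivot], $b[$col]];

        // eliminate the entries below the pivot
        for ($r = $col + 1; $r < $n; $r++) {
            $f = $a[$r][$col] / $a[$col][$col];
            for ($c = $col; $c < $n; $c++) {
                $a[$r][$c] -= $f * $a[$col][$c];
            }
            $b[$r] -= $f * $b[$col];
        }
    }

    // back substitution
    $w = array_fill(0, $n, 0.0);
    for ($r = $n - 1; $r >= 0; $r--) {
        $sum = $b[$r];
        for ($c = $r + 1; $c < $n; $c++) {
            $sum -= $a[$r][$c] * $w[$c];
        }
        $w[$r] = $sum / $a[$r][$r];
    }
    return $w;
}

// Least squares on the features [1, x, x^2] via the normal equations.
// Returns the weights [w0, w1, w2].
function fitQuadratic(array $xs, array $ys): array
{
    $xtx = array_fill(0, 3, array_fill(0, 3, 0.0)); // X^T X
    $xty = array_fill(0, 3, 0.0);                   // X^T y
    foreach ($xs as $i => $x) {
        $row = [1.0, $x, $x * $x];
        for ($j = 0; $j < 3; $j++) {
            $xty[$j] += $row[$j] * $ys[$i];
            for ($k = 0; $k < 3; $k++) {
                $xtx[$j][$k] += $row[$j] * $row[$k];
            }
        }
    }
    return solveLinearSystem($xtx, $xty);
}

function predictQuadratic(array $w, float $x): float
{
    return $w[0] + $w[1] * $x + $w[2] * $x * $x;
}

// Same data as in the question: x = 1..99, y = x^2.
$xs = range(1, 99);
$ys = array_map(function ($x) { return $x * $x; }, $xs);

$w = fitQuadratic($xs, $ys);
echo predictQuadratic($w, 5), "\n";
echo predictQuadratic($w, 10), "\n";
```

Because the targets are exactly quadratic, the fitted weights should be close to [0, 0, 1] and the two predictions close to 25 and 100. The trade-off is the same as with the polynomial kernel: you have to choose the degree yourself.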

From your response in the comments it's clear that you're looking to apply a neural network so that you don't have to worry about what degree of model to fit to your data. A neural network with a single hidden layer can fit any continuous function arbitrarily well, given enough hidden nodes and enough training data.

Unfortunately php-ml doesn't seem to have an MLP (multilayer perceptron, another term for a neural network) for regression available out of the box. I'm sure you could build one from appropriate layers, but if your goal is to get up and running with training regression models quickly, it might not be the best approach.
