
Trend lines (regression, curve fitting) Java library

I'm trying to develop an application that computes the same trend lines that Excel does, but for larger datasets.


But I can't find any Java library that calculates such regressions. For the linear model I'm using Apache Commons Math, and for the others there was a great numerical library from Michael Thomas Flanagan, but since January it is no longer available:

http://www.ee.ucl.ac.uk/~mflanaga/java/

Do you know of any other libraries or code repositories that calculate these regressions in Java?

Since they're all based on linear fits, OLSMultipleLinearRegression is all you need for linear, polynomial, exponential, logarithmic, and power trend lines.
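To spell out why one linear solver covers all five cases, here is how each trend line reduces to a linear fit. This is a sketch; the coefficient names a, b and c_k are my own notation, not something the library defines.

```latex
\begin{aligned}
\text{linear / polynomial:}\quad & y = c_0 + c_1 x + \dots + c_n x^n \\
\text{exponential:}\quad & y = a\,e^{b x} \;\Longrightarrow\; \ln y = \ln a + b\,x \\
\text{power:}\quad & y = a\,x^{b} \;\Longrightarrow\; \ln y = \ln a + b \ln x \\
\text{logarithmic:}\quad & y = a + b \ln x
\end{aligned}
```

In every case the right-hand side is linear in the unknown coefficients, so ordinary least squares applies once x, y, or both have been transformed. Note that the exponential and power fits minimize the squared error in ln y rather than in y.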

Your question gave me an excuse to download and play with the Commons Math regression tools, and I put together some trend line tools:

An interface:

public interface TrendLine {
    public void setValues(double[] y, double[] x); // y ~ f(x)
    public double predict(double x); // get a predicted y for a given x
}

An abstract class for regression-based trend lines:

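import java.util.Arrays;
import org.apache.commons.math3.linear.MatrixUtils;
import org.apache.commons.math3.linear.RealMatrix;
import org.apache.commons.math3.stat.regression.OLSMultipleLinearRegression;

public abstract class OLSTrendLine implements TrendLine {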

    RealMatrix coef = null; // will hold prediction coefs once we get values

    protected abstract double[] xVector(double x); // create vector of values from x
    protected abstract boolean logY(); // set true to predict log of y (note: y must be positive)

    @Override
    public void setValues(double[] y, double[] x) {
        if (x.length != y.length) {
            throw new IllegalArgumentException(String.format("The numbers of y and x values must be equal (%d != %d)",y.length,x.length));
        }
        double[][] xData = new double[x.length][]; 
        for (int i = 0; i < x.length; i++) {
            // the implementation determines how to produce a vector of predictors from a single x
            xData[i] = xVector(x[i]);
        }
        if(logY()) { // in some models we are predicting ln y, so we replace each y with ln y
            y = Arrays.copyOf(y, y.length); // user might not be finished with the array we were given
            for (int i = 0; i < x.length; i++) {
                y[i] = Math.log(y[i]);
            }
        }
        OLSMultipleLinearRegression ols = new OLSMultipleLinearRegression();
        ols.setNoIntercept(true); // let the implementation include a constant in xVector if desired
        ols.newSampleData(y, xData); // provide the data to the model
        coef = MatrixUtils.createColumnRealMatrix(ols.estimateRegressionParameters()); // get our coefs
    }

    @Override
    public double predict(double x) {
        double yhat = coef.preMultiply(xVector(x))[0]; // apply coefs to xVector
        if (logY()) yhat = (Math.exp(yhat)); // if we predicted ln y, we still need to get y
        return yhat;
    }
}

An implementation for polynomial or linear models:

(For linear models, just set the degree to 1 when calling the constructor.)

public class PolyTrendLine extends OLSTrendLine {
    final int degree;
    public PolyTrendLine(int degree) {
        if (degree < 0) throw new IllegalArgumentException("The degree of the polynomial must not be negative");
        this.degree = degree;
    }
    @Override
    protected double[] xVector(double x) { // {1, x, x*x, x*x*x, ...}
        double[] poly = new double[degree+1];
        double xi=1;
        for(int i=0; i<=degree; i++) {
            poly[i]=xi;
            xi*=x;
        }
        return poly;
    }
    @Override
    protected boolean logY() {return false;}
}

Exponential and power models are even easier:

(Note: we're predicting log y now, which is important. Both of these are only suitable for positive y.)

public class ExpTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,x};
    }

    @Override
    protected boolean logY() {return true;}
}

and

public class PowerTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,Math.log(x)};
    }

    @Override
    protected boolean logY() {return true;}

}
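To make the log-y trick concrete without the library, here is a minimal plain-Java sketch of the exponential case: fit a straight line to ln y with the closed-form least-squares formulas, then exponentiate to get back to y. The class and method names (`ExpFitSketch`, `fit`, `predict`) are made up for illustration and are not part of Commons Math or the classes above.

```java
public class ExpFitSketch {

    /** Returns {a, b} for y = a * exp(b * x), fitted by least squares on ln y. */
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            if (y[i] <= 0) {
                throw new IllegalArgumentException("y must be positive to take its log");
            }
            double ly = Math.log(y[i]); // work in log space
            sx += x[i];
            sy += ly;
            sxx += x[i] * x[i];
            sxy += x[i] * ly;
        }
        double b = (n * sxy - sx * sy) / (n * sxx - sx * sx); // slope of ln y vs x
        double lnA = (sy - b * sx) / n;                       // intercept of ln y vs x
        return new double[] {Math.exp(lnA), b};
    }

    /** Applies the fitted model, undoing the log transform. */
    static double predict(double[] ab, double x) {
        return ab[0] * Math.exp(ab[1] * x);
    }

    public static void main(String[] args) {
        double[] x = {0, 1, 2, 3};
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            y[i] = 2.0 * Math.exp(0.5 * x[i]); // noise-free data with a = 2, b = 0.5
        }
        double[] ab = fit(x, y);
        System.out.println(ab[0] + " " + ab[1]); // close to 2.0 and 0.5
    }
}
```

The power model works the same way with ln x in place of x.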

And a log model:

(This takes the log of x but predicts y, not ln y.)

public class LogTrendLine extends OLSTrendLine {
    @Override
    protected double[] xVector(double x) {
        return new double[]{1,Math.log(x)};
    }

    @Override
    protected boolean logY() {return false;}
}

And you can use it like this:

import java.util.Random;

public class TrendLineDemo { // class name is only illustrative
    public static void main(String[] args) {
        TrendLine t = new PolyTrendLine(2);
        Random rand = new Random();
        double[] x = new double[1000*1000];
        double[] err = new double[x.length];
        double[] y = new double[x.length];
        for (int i=0; i<x.length; i++) { x[i] = 1000*rand.nextDouble(); }
        for (int i=0; i<x.length; i++) { err[i] = 100*rand.nextGaussian(); }
        for (int i=0; i<x.length; i++) { y[i] = x[i]*x[i]+err[i]; } // quadratic model
        t.setValues(y,x);
        System.out.println(t.predict(12)); // when x=12, y should be close to 144 (12*12), e.g. 143.61380202745192
    }
}

Since you just wanted trend lines, I discarded the OLS models when I was done with them, but you might want to keep some data on goodness of fit, etc.
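If you do want a goodness-of-fit number, one option is to compute R² directly from the predictions instead of reading it off the regression object. The following is a minimal plain-Java sketch; the class name `GoodnessOfFit` and the method `rSquared` are hypothetical helpers, and the value is computed on the original y scale, which may not match what a spreadsheet reports for the log-transformed models.

```java
import java.util.function.DoubleUnaryOperator;

public class GoodnessOfFit {

    /** R^2 = 1 - SS_res / SS_tot, with SS_tot taken around the mean of y. */
    static double rSquared(double[] x, double[] y, DoubleUnaryOperator model) {
        double mean = 0;
        for (double v : y) {
            mean += v;
        }
        mean /= y.length;

        double ssRes = 0; // squared error of the model
        double ssTot = 0; // squared error of the "always predict the mean" baseline
        for (int i = 0; i < y.length; i++) {
            double residual = y[i] - model.applyAsDouble(x[i]);
            double deviation = y[i] - mean;
            ssRes += residual * residual;
            ssTot += deviation * deviation;
        }
        return 1.0 - ssRes / ssTot;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4};
        double[] y = {2, 4, 6, 8};
        System.out.println(rSquared(x, y, v -> 2 * v)); // perfect fit: prints 1.0
        System.out.println(rSquared(x, y, v -> 5.0));   // mean-only model: prints 0.0
    }
}
```

With the classes above, a fitted trend line can be passed as `t::predict`.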

For implementations using a moving average, moving median, etc., it looks like you can stick with Commons Math. Try DescriptiveStatistics and specify a window. You might want to do some smoothing, using interpolation as suggested in another answer.
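For reference, the idea behind a windowed statistic is simple enough to sketch in plain Java. This is not how DescriptiveStatistics is implemented; it is an illustrative trailing moving average with a hypothetical class and method name, where the first few outputs average over however many values are available so far.

```java
public class MovingAverageSketch {

    /** Trailing moving average: out[i] is the mean of the last `window` values up to i. */
    static double[] movingAverage(double[] values, int window) {
        if (window <= 0) {
            throw new IllegalArgumentException("window must be positive");
        }
        double[] out = new double[values.length];
        double sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];              // newest value enters the window
            if (i >= window) {
                sum -= values[i - window]; // oldest value leaves the window
            }
            out[i] = sum / Math.min(i + 1, window);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] smoothed = movingAverage(new double[] {1, 2, 3, 4, 5}, 3);
        System.out.println(java.util.Arrays.toString(smoothed)); // [1.0, 1.5, 2.0, 3.0, 4.0]
    }
}
```

A running sum keeps the cost at one pass over the data, which matters for the larger datasets mentioned in the question.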

You can use the different kinds of interpolators available in org.apache.commons.math3.analysis.interpolation, including, for example, LinearInterpolator, LoessInterpolator and NevilleInterpolator.

In addition to what maybeWeCouldStealAVa said:

The commons-math3 library is also available in the Maven repository.

The current version is 3.2, and the dependency tag is:

    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-math3</artifactId>
        <version>3.2</version>
    </dependency>
