
Apache Math generate distribution from data set

In Java's Apache Commons Math library, is there a way to take a set of data points and generate a distribution object from them? More specifically, I am trying to create a BetaDistribution object from a set of data, but the only way to create one is to pass in alpha and beta as its parameters. Do I have to work out these values from the data myself, or is there something in Apache Commons Math that will fit them for me?

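From the Smile project, whose BetaDistribution constructor estimates alpha and beta from the sample mean and variance (the method of moments):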

    public BetaDistribution(double[] data) {
        // A Beta distribution is only supported on [0, 1].
        for (int i = 0; i < data.length; i++) {
            if (data[i] < 0 || data[i] > 1) {
                throw new IllegalArgumentException("Samples are not in range [0, 1].");
            }
        }

        // Sample mean and variance. Math here is not java.lang.Math (which has
        // no mean/var); presumably it is Smile's own math utility class.
        mean = Math.mean(data);
        var = Math.var(data);

        // Method-of-moments estimates of the two shape parameters.
        alpha = mean * (mean * (1 - mean) / var - 1);
        beta = (1 - mean) * (mean * (1 - mean) / var - 1);
        if (alpha <= 0 || beta <= 0) {
            throw new IllegalArgumentException("Samples don't follow Beta Distribution.");
        }

        // Recompute the moments and the entropy from the fitted parameters.
        mean = alpha / (alpha + beta);
        var = alpha * beta / ((alpha + beta) * (alpha + beta) * (alpha + beta + 1));
        entropy = Math.log(Beta.beta(alpha, beta)) - (alpha - 1) * Gamma.digamma(alpha) - (beta - 1) * Gamma.digamma(beta) + (alpha + beta - 2) * Gamma.digamma(alpha + beta);
    }
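
If you would rather not pull in Smile, the same moment-matching step can be written in a few lines of plain Java and the result handed to Apache Commons Math. The sketch below is only an illustration: the class name BetaMoments and the method fit are made up for this example, and it assumes the unbiased (n - 1) sample variance, which may differ from what a given library uses.

```java
// Method-of-moments fit for a Beta distribution, using only the Java standard library.
// BetaMoments and fit are hypothetical names chosen for this sketch.
final class BetaMoments {

    /** Returns {alpha, beta} estimated from samples that must lie in [0, 1]. */
    static double[] fit(double[] data) {
        if (data.length < 2) {
            throw new IllegalArgumentException("Need at least two samples.");
        }
        double sum = 0.0;
        for (double x : data) {
            if (x < 0 || x > 1) {
                throw new IllegalArgumentException("Samples are not in range [0, 1].");
            }
            sum += x;
        }
        double mean = sum / data.length;

        // Unbiased sample variance (divides by n - 1); this is an assumption of the sketch.
        double ss = 0.0;
        for (double x : data) {
            ss += (x - mean) * (x - mean);
        }
        double var = ss / (data.length - 1);
        if (var == 0.0) {
            throw new IllegalArgumentException("Samples have zero variance.");
        }

        // Same formulas as in the constructor quoted above.
        double common = mean * (1 - mean) / var - 1;
        double alpha = mean * common;
        double beta = (1 - mean) * common;
        if (alpha <= 0 || beta <= 0) {
            throw new IllegalArgumentException("Samples don't follow Beta Distribution.");
        }
        return new double[] {alpha, beta};
    }

    public static void main(String[] args) {
        double[] p = fit(new double[] {0.2, 0.4, 0.6, 0.8});
        System.out.println("alpha = " + p[0] + ", beta = " + p[1]);
        // With Apache Commons Math 3 on the classpath, the estimates can be passed to
        // new org.apache.commons.math3.distribution.BetaDistribution(p[0], p[1]).
    }
}
```

For the four sample values in main, both estimates come out at about 1.375. Moment matching is quick but only approximate; if you need a maximum-likelihood fit you would have to optimise the likelihood numerically yourself.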
