
What to change in my code to calculate sample standard deviation instead of population standard deviation?

My code calculates the population standard deviation, but I need it to calculate the sample standard deviation. I have compared both formulas and tried changing my calculations, but nothing seems to work. Thanks in advance for everyone's help or input.

public class MeanAndStandardDeviation {
public static void main (String argv []) throws IOException {
    BufferedReader stdin =
            new BufferedReader (new InputStreamReader (System.in));
    NumberFormat nf = new DecimalFormat ("0.00");
    nf.setMinimumFractionDigits (2);//Sets Min digits
    nf.setMaximumFractionDigits (2);//Sets Max digits
    String inputValue;
    int count = 0;
    //For Loop for count
    for(int i = 0; i < count; i++){
        count++;
    }
    double varianceFinal = 0;
    List<String> input = new ArrayList<String>();//String ArrayList
    List<Double> numbers = new ArrayList<Double>();//Double ArrayList

  //While loop that takes in all my input and assigns it to the ArrayLists
  //Parameters set for when null is entered and total numbers go over 500
    while((inputValue = stdin.readLine()) != null && !inputValue.equals("") && input.size()<500){//Parameters set for when null is entered and total numbers go over 500
        input.add(inputValue);
        numbers.add (Double.parseDouble(inputValue));  
    }

System.out.println ("Standard Deviation: " +(nf.format(calcStdDev (numbers, count, varianceFinal))));//Prints the Standard Deviation
}

//StandardDeviation Class
static double calcStdDev (List<Double> numbers, int count, double variance){
    variance = 0;
    double sum = 0;
    for(int i = 0; i < numbers.size(); i++){
        sum += numbers.get(i);
        variance += numbers.get(i) * numbers.get(i);
        count++;
    }
    double varianceFinal = ((variance/count)-(sum*sum)/(count*count));
return Math.sqrt(varianceFinal);

}
} 

Seriously, your code is "wrong" on many levels. So instead of debugging all of that for you, I will give you some hints on how to fix and simplify your code; then it should be very easy for you to resolve your actual math problem.

First of all, your code is written in such a confusing style that it is much harder to understand (and therefore to debug) than it needs to be.

Example:

int count = 0;
//For Loop for count
for(int i = 0; i < count; i++){
    count++;
}

That for loop doesn't do anything: count starts at 0, so the condition i < count is false on the first check and the body never runs. And even if the condition were something else, like i < someNumber, you would still just need to put count = someNumber there instead of looping!

Same here: what is the point of providing count as an argument to your calc method, only to then increase it? So, let's rewrite that:

public static double calcStdDev (List<Double> numbers, double variance) {
  double sumOfNumbers = 0;
  double sumOfSquares = 0;
  for (double number : numbers) {
    sumOfNumbers += number;
    sumOfSquares += number * number;
  }
  // ... and instead of calculating count, you simply have
  int numberOfNumbers = numbers.size();
  // ... and now, do your math

The other thing that is really strange in your code is how you set up your variance variable, and how it is used within your calc method.

Long story short: step back and remove everything from your code that isn't required.
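For the question as asked, the only difference between the two formulas is the divisor: the population variance divides the sum of squared deviations by n, while the sample variance divides it by n - 1. The following is a minimal sketch of the simplified method with that change applied, not a definitive implementation. The class name SampleStdDevSketch and the method name sampleStdDev are hypothetical, and the check for fewer than two values is an added assumption, since the sample formula is undefined for n = 1.

```java
import java.util.List;

class SampleStdDevSketch {
    /** Sample standard deviation computed from the sum and the sum of squares. */
    static double sampleStdDev(List<Double> numbers) {
        int n = numbers.size();
        if (n < 2) {
            // Assumption: reject inputs for which n - 1 would be zero or negative.
            throw new IllegalArgumentException("need at least two values");
        }
        double sum = 0;
        double sumOfSquares = 0;
        for (double number : numbers) {
            sum += number;
            sumOfSquares += number * number;
        }
        // The population variance would divide by n here;
        // the sample variance divides by n - 1 instead.
        double variance = (sumOfSquares - (sum * sum) / n) / (n - 1);
        return Math.sqrt(variance);
    }
}
```

For the inputs 2, 4, 4, 4, 5, 5, 7, 9 the population standard deviation is 2, while this sketch returns about 2.14.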

It is a bad idea to compute the variance the way you do. If the mean is largish, e.g. 10 million, and the noise is smallish, e.g. around 1, then the limited precision of doubles may well mean that your computed variance is negative and the standard deviation will be NaN.

You should either compute it in two passes, e.g.
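The cancellation problem can be reproduced with a small sketch. The class name CancellationDemo and the method name naiveSampleVariance are hypothetical, and the input values mentioned below are made up for illustration; they were chosen with a mean near one billion so that the rounding is clearly visible.

```java
class CancellationDemo {
    /** Sample variance from running sums, in the same style as the formula in the question. */
    static double naiveSampleVariance(double[] x) {
        double sum = 0;
        double sumOfSquares = 0;
        for (double v : x) {
            sum += v;
            sumOfSquares += v * v; // each square is about 1e18 and is rounded to a multiple of 128 or more
        }
        int n = x.length;
        // Two nearly equal, very large numbers are subtracted here, so the
        // rounding errors of the sums can dominate the small true difference.
        return (sumOfSquares - (sum * sum) / n) / (n - 1);
    }
}
```

Called with the four values 1000000004, 1000000007, 1000000013 and 1000000016, whose exact sample variance is 30, the rounded sums make the numerator negative, so Math.sqrt of the result is NaN.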

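// x holds the n input values
double mean = 0.0;
for (int i = 0; i < n; ++i) {
    mean += x[i];
}
mean /= n;
double var = 0.0;
for (int i = 0; i < n; ++i) {
    double d = x[i] - mean;
    var += d * d;
}
var /= n;  // population variance; divide by (n - 1) instead for the sample variance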

or in one pass, e.g.

double mean = 0.0;
double var = 0.0;
for (int i = 0; i < n; ++i) {
    double f = 1.0 / (i + 1);
    double d = x[i] - mean;
    mean += d * f;                        // running mean of the first i + 1 values
    var = (1.0 - f) * (var + f * d * d);  // running population variance
}

(It takes a bit of tedious algebra to show that the one-pass method gives the same answer as the two-pass method.)
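Both fragments yield the population variance. As a hedged sketch of how they might be adapted to the sample standard deviation the question asks for, the two-pass version can divide by n - 1, and the one-pass version can rescale its result by n / (n - 1) at the end. The class name StableStdDev, the method names, and the helper requireAtLeastTwo are hypothetical, and taking a List<Double> instead of an array is an assumption made to match the question's code.

```java
import java.util.List;

class StableStdDev {
    /** Two-pass version: compute the mean first, then the squared deviations from it. */
    static double twoPassSampleStdDev(List<Double> numbers) {
        int n = requireAtLeastTwo(numbers);
        double mean = 0.0;
        for (double x : numbers) {
            mean += x;
        }
        mean /= n;
        double sumOfSquaredDeviations = 0.0;
        for (double x : numbers) {
            double d = x - mean;
            sumOfSquaredDeviations += d * d;
        }
        return Math.sqrt(sumOfSquaredDeviations / (n - 1)); // n - 1 for the sample variance
    }

    /** One-pass version: running mean and running population variance, rescaled at the end. */
    static double onePassSampleStdDev(List<Double> numbers) {
        int n = requireAtLeastTwo(numbers);
        double mean = 0.0;
        double var = 0.0; // population variance of the values seen so far
        int i = 0;
        for (double x : numbers) {
            double f = 1.0 / (i + 1);
            double d = x - mean;
            mean += d * f;
            var = (1.0 - f) * (var + f * d * d);
            i++;
        }
        // Convert the population variance to the sample variance.
        return Math.sqrt(var * n / (n - 1));
    }

    // Assumption: the sample formula needs at least two values.
    private static int requireAtLeastTwo(List<Double> numbers) {
        if (numbers.size() < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        return numbers.size();
    }
}
```

Both methods work with differences from the mean rather than with sums of large squares, which is why they avoid the cancellation that the sum-of-squares formula suffers from when the mean is large.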
