
Java, BigDecimal: How do I unit-test for rounding errors?

I will give a simplified example of my actual situation:

Let's say I have to implement some code in Java for calculating the weighted arithmetic mean. I am given two arrays of floating-point values (expressed as doubles), each of the same length, the first containing the values and the second containing their respective weights.

Let's also say I write an implementation that returns a floating-point value (also a double) representing the weighted arithmetic mean of the input values:

import java.math.BigDecimal;
import java.math.RoundingMode;

public static double calculateWeightedArithmeticMean(double[] values,
        double[] weights) {

    if (values.length != weights.length) {
        throw new IllegalArgumentException();
    }

    // No input: the mean is defined as 0 here.
    if (values.length == 0) {
        return 0;
    }

    // A single value is its own mean, rounded to one decimal place.
    if (values.length == 1) {
        return new BigDecimal(values[0]).setScale(1, RoundingMode.HALF_UP).
                doubleValue();
    }

    // Accumulate sum(value * weight) and sum(weight) as BigDecimal.
    BigDecimal dividend = BigDecimal.ZERO;
    BigDecimal divisor = BigDecimal.ZERO;
    for (int i = 0; i < values.length; i++) {
        dividend = dividend.add(new BigDecimal(values[i]).
                multiply(new BigDecimal(weights[i])));
        divisor = divisor.add(new BigDecimal(weights[i]));
    }
    if (dividend.compareTo(BigDecimal.ZERO) == 0) {
        return 0d;
    }
    // Round the quotient to one decimal place.
    return dividend.divide(divisor, 1, RoundingMode.HALF_UP).doubleValue();
}

I write a unit test passing a few values (say, 3 values + 3 weights). I first calculate their weighted arithmetic mean manually (using a calculator) and then write a unit test that checks that my code returns that value.

I believe that such a test is not representative of a situation where the number of values is substantially larger, because of rounding errors. Maybe the code I've implemented works well for 3 values + 3 weights (for a given precision) because the rounding error is smaller than the precision in this case, but it's quite possible that the rounding error becomes greater than the desired precision for 1000 values + 1000 weights.

My question is:

  • Should I also write a unit test that checks a very large number of values (a "worst case scenario" for production use)?
  • If I should, how do I write it? How do I obtain the correct value, so that I can use it in my test's assertion(s)? (Calculating the mean by hand for 2x1000 values seems like a bad idea, even when using a weighted mean calculator.)
  • The same goes for similar scenarios: calculating the geometric mean, etc.

When writing unit tests, you always have to give up somewhere. The trick is to give up when you're confident that you know enough :-)

In your case, a few simple test cases are:

  • Empty arrays
  • Create a second algorithm which uses precise arithmetic (like BigDecimal input arrays) to calculate error margins for selected inputs
  • Two arrays filled with the same value/weight pair repeated. That way, you know the result (it should be the same as for the first pair alone).
    • Try to find pairs of numbers which cause large rounding errors (like 1/10, 0.1/1, 0.2/2, which all end up as 0.1, which can't be represented exactly as a double)
  • Create input arrays which contain random variances (i.e. ±1% * rand()). These should even out as you grow the input arrays.
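The "second algorithm" and the random-variance idea from the list above can be sketched together as follows. This is only an illustration under stated assumptions: the class name WeightedMeanOracle, the helper names, the seed, and the plain-double doubleMean (a stand-in for whatever code is under test) are hypothetical and not part of the question's code.

```java
import java.math.BigDecimal;
import java.math.MathContext;
import java.util.Random;

final class WeightedMeanOracle {

    // Reference implementation: sums and products are exact in BigDecimal;
    // only the final division is rounded (to 34 significant digits).
    static BigDecimal exactMean(BigDecimal[] values, BigDecimal[] weights) {
        BigDecimal dividend = BigDecimal.ZERO;
        BigDecimal divisor = BigDecimal.ZERO;
        for (int i = 0; i < values.length; i++) {
            dividend = dividend.add(values[i].multiply(weights[i]));
            divisor = divisor.add(weights[i]);
        }
        return dividend.divide(divisor, MathContext.DECIMAL128);
    }

    // Hypothetical stand-in for the code under test: plain double arithmetic.
    static double doubleMean(double[] values, double[] weights) {
        double dividend = 0.0;
        double divisor = 0.0;
        for (int i = 0; i < values.length; i++) {
            dividend += values[i] * weights[i];
            divisor += weights[i];
        }
        return dividend / divisor;
    }

    // Absolute difference between the two implementations for n random pairs.
    static double errorForRandomInput(int n, long seed) {
        Random random = new Random(seed);
        double[] values = new double[n];
        double[] weights = new double[n];
        BigDecimal[] exactValues = new BigDecimal[n];
        BigDecimal[] exactWeights = new BigDecimal[n];
        for (int i = 0; i < n; i++) {
            // Values scattered within 1% around 100, weights in [1, 2).
            values[i] = 100.0 * (1.0 + 0.01 * (2.0 * random.nextDouble() - 1.0));
            weights[i] = 1.0 + random.nextDouble();
            // new BigDecimal(double) keeps the exact binary value of the double.
            exactValues[i] = new BigDecimal(values[i]);
            exactWeights[i] = new BigDecimal(weights[i]);
        }
        double expected = exactMean(exactValues, exactWeights).doubleValue();
        return Math.abs(doubleMean(values, weights) - expected);
    }

    public static void main(String[] args) {
        for (int n : new int[] {3, 1000, 100000}) {
            System.out.println(n + " pairs: error = " + errorForRandomInput(n, 42L));
        }
    }
}
```

Running it for growing input sizes gives a rough, empirical error margin for the chosen inputs, which can then serve as the tolerance in the real tests.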

When comparing the results, use assertEquals(double, double, double), where the first two arguments are the values to compare and the last one is the precision (1e-3 for 3 digits after the decimal point).
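To show what such a tolerance-based comparison does without depending on a test framework, here is a minimal sketch. The class ToleranceDemo and its helpers isClose and assertClose are hypothetical; they only imitate the basic contract of the three-argument assertEquals (pass when the absolute difference is at most the delta). In a real JUnit test you would call assertEquals directly.

```java
final class ToleranceDemo {

    // Hypothetical helper: true when |expected - actual| <= delta.
    static boolean isClose(double expected, double actual, double delta) {
        return Math.abs(expected - actual) <= delta;
    }

    // Hypothetical stand-in for assertEquals(expected, actual, delta).
    static void assertClose(double expected, double actual, double delta) {
        if (!isClose(expected, actual, delta)) {
            throw new AssertionError("expected " + expected + " but was "
                    + actual + " (delta " + delta + ")");
        }
    }

    public static void main(String[] args) {
        double actual = 0.1 + 0.2;          // not exactly 0.3 in binary floating point
        System.out.println(actual == 0.3);  // prints false: exact comparison fails
        assertClose(0.3, actual, 1e-3);     // passes: difference is far below 1e-3
        System.out.println("within 1e-3");
    }
}
```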

And lastly, you need to use the algorithm and see how it behaves. When you find a problem, add a test case for that specific case.

Yes, you should. Testing should always involve boundary values.

You can provide an epsilon bound within which you assert that an answer is (approximately) correct.
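One way to get an expected value for a large input without calculating it by hand is to build the input so that the mean is known by construction, then assert within an epsilon. The sketch below copies calculateWeightedArithmeticMean from the question; everything else (the class name LargeInputCheck, the helpers sequence, constant and check, and the EPSILON of 1e-9) is an assumption made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

final class LargeInputCheck {

    static final double EPSILON = 1e-9; // assumed tolerance, pick your own

    // Copied from the question.
    static double calculateWeightedArithmeticMean(double[] values, double[] weights) {
        if (values.length != weights.length) {
            throw new IllegalArgumentException();
        }
        if (values.length == 0) {
            return 0;
        }
        if (values.length == 1) {
            return new BigDecimal(values[0]).setScale(1, RoundingMode.HALF_UP).doubleValue();
        }
        BigDecimal dividend = BigDecimal.ZERO;
        BigDecimal divisor = BigDecimal.ZERO;
        for (int i = 0; i < values.length; i++) {
            dividend = dividend.add(new BigDecimal(values[i]).multiply(new BigDecimal(weights[i])));
            divisor = divisor.add(new BigDecimal(weights[i]));
        }
        if (dividend.compareTo(BigDecimal.ZERO) == 0) {
            return 0d;
        }
        return dividend.divide(divisor, 1, RoundingMode.HALF_UP).doubleValue();
    }

    // 1, 2, ..., n
    static double[] sequence(int n) {
        double[] result = new double[n];
        for (int i = 0; i < n; i++) {
            result[i] = i + 1;
        }
        return result;
    }

    // n copies of x
    static double[] constant(int n, double x) {
        double[] result = new double[n];
        java.util.Arrays.fill(result, x);
        return result;
    }

    static void check(double expected, double actual) {
        if (Math.abs(expected - actual) > EPSILON) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    public static void main(String[] args) {
        int n = 1000;
        // Equal weights: the mean of 1..n is (n + 1) / 2, here 500.5.
        check(500.5, calculateWeightedArithmeticMean(sequence(n), constant(n, 0.1)));
        // Identical values: the mean is that value, whatever the weights are.
        check(7.3, calculateWeightedArithmeticMean(constant(n, 7.3), sequence(n)));
        System.out.println("both large-input checks passed");
    }
}
```

The same construction works for other sizes, so the "worst case" input length can simply be a parameter of the test.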


 