
Comparing two CSV files in Java with approximate matching for doubles

I need to write a JUnit test that compares two CSV files of the same format and passes only if the absolute difference between corresponding numeric values is less than a threshold. Strings must match exactly, while doubles only have to satisfy the threshold criterion.

CSV format:

first.csv
 Name    price-1    price-2
 item1    5.12       6.12
 item2    4.23       5.56
 item3    11.2       12.23

second.csv

 Name    price-1     price-2
 item1    5.12       6.10
 item2    4.20       5.50
 item3    11.19      12.19

Now let's say the difference threshold is 0.15. The absolute difference between price-1 of item2 in first.csv and second.csv is 0.03, so the JUnit test should pass. If the threshold were 0.02 instead, it should fail.

What would be a good solution for this?

When you use assertEquals with doubles, you can pass in a threshold. JUnit calls this the delta.

Or you can use

assertTrue (Math.abs(val1 - val2) < threshold);

So in your example, price-2 of item1 is 6.12 and 6.10.

With the first approach you could use

assertEquals(6.12d, 6.10d, 0.15);

which would pass, or

assertTrue(Math.abs(6.12d - 6.10d) < 0.15);

which would also pass.
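To see the threshold check on its own, outside JUnit, here is a minimal sketch. The class and method names (ThresholdCheck, withinThreshold) are made up for illustration, and it uses a strict "less than" comparison as the question describes.

```java
final class ThresholdCheck {
    private ThresholdCheck() {}

    /** Returns true when a and b differ by strictly less than threshold. */
    static boolean withinThreshold(double a, double b, double threshold) {
        return Math.abs(a - b) < threshold;
    }
}
```

With the item2 price-1 values from the question, withinThreshold(4.23, 4.20, 0.15) is true and withinThreshold(4.23, 4.20, 0.02) is false. Avoid picking a threshold exactly equal to an expected difference, because binary floating point makes that borderline case unreliable.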

I would recommend experimenting with assertEquals and plugging in numbers so that you understand its overloaded methods.

Because you are reading from a file, you will most likely get strings. To convert them to doubles, do:

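try {
    // primitive doubles select the assertEquals(double, double, double) overload directly
    double d1 = Double.parseDouble(str1);
    double d2 = Double.parseDouble(str2);
    assertEquals(d1, d2, 0.15);
} catch (NumberFormatException e) {
    // not a number, so it cannot be compared numerically - perhaps call fail("fail msg here")
}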
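Since the question wants exact matching for strings and threshold matching for doubles, one possible way to combine the two is to try the numeric comparison first and fall back to a plain string comparison. This is only a sketch: CellCompare and cellsMatch are hypothetical names, and it assumes that any cell Double.parseDouble accepts should be treated as a number.

```java
final class CellCompare {
    private CellCompare() {}

    /**
     * Compares two cells: numerically (strictly within threshold) when both
     * parse as doubles, otherwise by exact string equality.
     */
    static boolean cellsMatch(String a, String b, double threshold) {
        try {
            double x = Double.parseDouble(a.trim());
            double y = Double.parseDouble(b.trim());
            return Math.abs(x - y) < threshold;
        } catch (NumberFormatException e) {
            // At least one cell is not numeric: require an exact text match.
            return a.equals(b);
        }
    }
}
```

In a test this could be wrapped as assertTrue("cells differ: " + a + " vs " + b, CellCompare.cellsMatch(a, b, 0.15)). Note that a numeric cell paired with a non-numeric one ends up in the string branch and is reported as a mismatch.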

You listed JUnit in the tags.

JUnit's assertEquals(double expected, double actual, double delta) lets you specify, with the last parameter, how close the two values have to be.

I'd just read in the values and call assertEquals for each pair in a test...

Or is there something in the question I'm not getting?

To parse the lines: your examples use spaces, but you say "CSV" (comma-separated). If they actually are CSV, you could use something like:

String[] line = currentLine.split(",");

on each line. That would give you line[0]="item1", line[1]="5.12", line[2]="6.12".

After that, try parsing line[1] and line[2] with Double.parseDouble().
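The split-then-parse step could be sketched roughly like this. RowParser and its methods are illustrative names, not part of any library; the delimiter is passed in as a regex so the same code handles real CSV (",") and the whitespace-separated sample from the question ("\\s+"). It assumes the first field is the name and every remaining field is numeric.

```java
import java.util.Arrays;

final class RowParser {
    private RowParser() {}

    /** Splits a line on the given delimiter regex and trims every field. */
    static String[] fields(String line, String delimiterRegex) {
        return Arrays.stream(line.trim().split(delimiterRegex))
                .map(String::trim)
                .toArray(String[]::new);
    }

    /** Parses every field after the first (the name) as a double. */
    static double[] prices(String[] fields) {
        return Arrays.stream(fields, 1, fields.length)
                .mapToDouble(Double::parseDouble)
                .toArray();
    }
}
```

A header row such as "Name,price-1,price-2" would make prices throw NumberFormatException, so it has to be skipped before calling it.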

By the way, use assertEquals rather than assertTrue: the more specific assertEquals displays the expected value and the actual value as part of the failure message in the JUnit results.

I also recommend passing in the optional message string. The test line would look like this:

assertEquals("item " + file1.line[0] + " values do not match",
    Double.parseDouble(file1.line[1]),
    Double.parseDouble(file2.line[1]),
    0.001);

There is also the problem of making sure you are reading the same line from each file, i.e. pairing them correctly. If the files are guaranteed to be in the same order you are fine, but if not, you might want to index the first file in a hash map keyed by the name field:

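// assumes java.util.* and java.nio.file.* are imported; readAllLines throws IOException
Map<String, String> file1hash = new HashMap<>();
for (String line : Files.readAllLines(Paths.get("first.csv")))
    file1hash.put(line.split(",")[0], line);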

Then, as you iterate through the second file, you can easily do:

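for (String line2 : Files.readAllLines(Paths.get("second.csv"))) {
    String line1 = file1hash.get(line2.split(",")[0]);
    // line1 is null if first.csv has no row with this name; otherwise compare line1 with line2
}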

to make sure line1 and line2 refer to the same item.
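Putting the pairing idea together, a rough sketch might look like the following. CsvPairing and findMismatches are hypothetical names. It assumes comma-separated data rows with the header already removed, a unique name in the first column, and numeric values in every other column (a non-numeric value would throw NumberFormatException).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CsvPairing {
    private CsvPairing() {}

    /** Pairs rows by name and returns one message per mismatch (empty list = files match). */
    static List<String> findMismatches(List<String> first, List<String> second, double threshold) {
        // Index the first file by its name column.
        Map<String, String[]> byName = new HashMap<>();
        for (String line : first) {
            String[] fields = line.split(",");
            byName.put(fields[0], fields);
        }
        List<String> mismatches = new ArrayList<>();
        for (String line : second) {
            String[] actual = line.split(",");
            // remove() so that whatever is left afterwards is missing from the second file
            String[] expected = byName.remove(actual[0]);
            if (expected == null) {
                mismatches.add(actual[0] + ": missing from first file");
                continue;
            }
            if (expected.length != actual.length) {
                mismatches.add(actual[0] + ": column count differs");
                continue;
            }
            for (int i = 1; i < expected.length; i++) {
                double e = Double.parseDouble(expected[i].trim());
                double a = Double.parseDouble(actual[i].trim());
                if (Math.abs(e - a) >= threshold) {
                    mismatches.add(actual[0] + " column " + i + ": " + e + " vs " + a);
                }
            }
        }
        for (String name : byName.keySet()) {
            mismatches.add(name + ": missing from second file");
        }
        return mismatches;
    }
}
```

In the JUnit test you could then write assertTrue(mismatches.toString(), mismatches.isEmpty()), so a failure lists every mismatched item at once instead of stopping at the first one.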
