Levenshtein distance in Java outputting the wrong number

For my university assignment in Java I have been asked to provide "extra analytics functions". I decided to use Levenshtein distance, but the number printed to the console is one less than the actual answer. For example, the distance between "cat" and "hat" should be 1, but it is displayed as 0.

public class Levenshtein {

public Levenshtein(String first, String second) {

    char [] s = first.toCharArray();
    char [] t = second  .toCharArray();
    int Subcost = 0;

    int[][] array = new int[first.length()][second.length()];

    for (int i = 0; i < array[0].length; i++)
    {
        array[0][i] = i;
    }

    for (int j = 0; j < array.length; j++)
    {

        array [j][0]= j;
    }

    for (int i = 1; i < second.length(); i++)
    {
        for (int j = 1; j < first.length(); j++)
        {
            if (s[j] == t [i])
            {
                Subcost = 0;
            }
            else
            {
                Subcost = 1;
            }

            array [j][i] = Math.min(array [j-1][i] +1,
                    Math.min(array [j][i-1] +1,
                            array [j-1][i-1] + Subcost) );
        }
    }

    UI.output("The Levenshtein distance is -> " + array[first.length()-1][second.length()-1]);

}

}

Apparently you're using the following algorithm:

https://en.wikipedia.org/wiki/Levenshtein_distance#Iterative_with_full_matrix
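For reference, the full-matrix method can be written as the recurrence below. This is only a sketch in my own notation: s and t are the two strings with lengths m and n, characters are indexed from 1, and d is the table being filled.

```latex
\begin{aligned}
d_{i,0} &= i \quad (0 \le i \le m), \qquad d_{0,j} = j \quad (0 \le j \le n),\\
d_{i,j} &= \min\bigl(d_{i-1,j} + 1,\; d_{i,j-1} + 1,\; d_{i-1,j-1} + [s_i \ne t_j]\bigr)
\quad (1 \le i \le m,\ 1 \le j \le n).
\end{aligned}
```

The distance is d_{m,n}. Row 0 and column 0 stand for the empty prefix, so the table needs one more row and one more column than the strings have characters.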

I think the indices are slightly off. I'm not sure exactly where the problem is, but here is a working version:

public int calculateLevenshteinDistance(String first, String second) {

    char[] s = first.toCharArray();
    char[] t = second.toCharArray();
    int substitutionCost = 0;

    int m = first.length();
    int n = second.length();

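    // One extra row and column: index 0 stands for the empty prefix.
    int[][] array = new int[m + 1][n + 1];

    // Turning the first i characters of `first` into "" takes i deletions.
    for (int i = 1; i <= m; i++) {
        array[i][0] = i;
    }

    // Turning "" into the first j characters of `second` takes j insertions.
    for (int j = 1; j <= n; j++) {
        array[0][j] = j;
    }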

    for (int j = 1; j <= n; j++) {
        for (int i = 1; i <= m; i++) {
            if (s[i - 1] == t[j - 1]) {
                substitutionCost = 0;
            } else {
                substitutionCost = 1;
            }

            int deletion = array[i - 1][j] + 1;
            int insertion = array[i][j - 1] + 1;
            int substitution = array[i - 1][j - 1] + substitutionCost;
            int cost = Math.min(
                    deletion,
                    Math.min(
                            insertion,
                            substitution));
            array[i][j] = cost;
        }
    }

    return array[m][n];
}
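As far as I can tell, the difference comes down to this: the question's array has first.length() rows and second.length() columns, and its loops start at index 1 while comparing s[j] with t[i] directly, so the first character of each string is never compared. The sketch below puts both variants side by side to check that. The class name LevenshteinCheck and the method names questionVersion and distance are made up for this illustration, and questionVersion returns the value instead of calling UI.output.

```java
public class LevenshteinCheck {

    // The question's indexing, unchanged apart from returning the result.
    // Assumes both strings are non-empty (array[0] would fail otherwise).
    public static int questionVersion(String first, String second) {
        char[] s = first.toCharArray();
        char[] t = second.toCharArray();
        int[][] array = new int[first.length()][second.length()];

        for (int i = 0; i < array[0].length; i++) {
            array[0][i] = i;
        }
        for (int j = 0; j < array.length; j++) {
            array[j][0] = j;
        }

        // Starts at 1 and reads s[j], t[i]: s[0] and t[0] are skipped.
        for (int i = 1; i < second.length(); i++) {
            for (int j = 1; j < first.length(); j++) {
                int subCost = (s[j] == t[i]) ? 0 : 1;
                array[j][i] = Math.min(array[j - 1][i] + 1,
                        Math.min(array[j][i - 1] + 1,
                                array[j - 1][i - 1] + subCost));
            }
        }
        return array[first.length() - 1][second.length() - 1];
    }

    // Same indexing as the working version: (m + 1) x (n + 1) table,
    // table index i corresponds to character i - 1.
    public static int distance(String first, String second) {
        int m = first.length();
        int n = second.length();
        int[][] d = new int[m + 1][n + 1];

        for (int i = 1; i <= m; i++) {
            d[i][0] = i;
        }
        for (int j = 1; j <= n; j++) {
            d[0][j] = j;
        }

        for (int j = 1; j <= n; j++) {
            for (int i = 1; i <= m; i++) {
                int cost = (first.charAt(i - 1) == second.charAt(j - 1)) ? 0 : 1;
                d[i][j] = Math.min(d[i - 1][j] + 1,
                        Math.min(d[i][j - 1] + 1, d[i - 1][j - 1] + cost));
            }
        }
        return d[m][n];
    }

    public static void main(String[] args) {
        System.out.println(questionVersion("cat", "hat")); // prints 0
        System.out.println(distance("cat", "hat"));        // prints 1
        System.out.println(distance("kitten", "sitting")); // prints 3
    }
}
```

With "cat" and "hat" the only differing characters are the first ones, which is exactly the pair the question's loops skip, hence the 0.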
