KMP DFA restart state

Question

I am referring to Chapter 5 (string matching) of "Algorithms, Fourth Edition" by Sedgewick & Wayne.

The algorithm in question is KMP substring search, which builds a DFA from the pattern. I understand the algorithm for building the DFA; the code is as follows:

public KMP(String pat) {
        this.R = 256;
        this.pat = pat;

        // build DFA from pattern
        int m = pat.length();
        dfa = new int[R][m]; 
        dfa[pat.charAt(0)][0] = 1; 
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < R; c++) 
                dfa[c][j] = dfa[c][x];     // Copy mismatch cases. 
            dfa[pat.charAt(j)][j] = j+1;   // Set match case. 
            x = dfa[pat.charAt(j)][x];     // Update restart state. 
        } 
    }

I don't understand the following line: x = dfa[pat.charAt(j)][x]; // Update restart state.

I understand that this value is obtained by feeding pat[1..j-1] into the partially built DFA, but I can't see how the code achieves this.

I also understand that x is the length of the longest prefix of the pattern that is also a suffix.

I have seen many related questions, but those are about understanding the algorithm itself.

I need to understand how x = dfa[pat.charAt(j)][x]; // Update restart state. simulates the restart state.

Answer 1

If we look carefully, x is initialized to state 0 and j to state 1.

Now we keep moving both forward based on the next pattern character. Since x is behind j, the column for state x is already filled in, so the DFA already knows which state comes next from x. By default all entries point back to 0, so that line always maintains the matched prefix if there is one, and otherwise restarts at 0.

dfa[c][j] = dfa[c][x]; // Copy mismatch cases. This line just creates the failure (back) pointers.

x = dfa[pat.charAt(j)][x]; // Update restart state. This line moves the prefix ahead to stay in sync with j, so x always points to the state where prefix == suffix.

Perhaps this will help further: https://labuladong.gitbook.io/algo-en/i.-dynamic-programming/kmpcharactermatchingalgorithmindynamicprogramming
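To make the "x stays in sync with j" claim concrete, here is a minimal sketch that runs the same construction as in the question while recording x for every column, then checks each recorded value against a brute-force run of pat[1..j-1] through the DFA. The class name RestartStateDemo, the restart array, and the simulate helper are hypothetical names added for illustration; they are not part of the book's code.

```java
import java.util.Arrays;

final class RestartStateDemo {
    static final int R = 256; // alphabet size, as in the question's code

    // Same construction as in the question, but records the value of x
    // that is used when column j is filled in (restart[j]).
    static int[][] buildDfa(String pat, int[] restart) {
        int m = pat.length();
        int[][] dfa = new int[R][m];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            restart[j] = x;                  // x == state after feeding pat[1..j-1]
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][x];       // Copy mismatch cases.
            dfa[pat.charAt(j)][j] = j + 1;   // Set match case.
            x = dfa[pat.charAt(j)][x];       // One more DFA step: now pat[1..j] is consumed.
        }
        return dfa;
    }

    // Brute force: start in state 0 and feed pat[1..j-1] through the DFA.
    // Only columns below j are ever read, and those are final once built.
    static int simulate(int[][] dfa, String pat, int j) {
        int state = 0;
        for (int i = 1; i < j; i++)
            state = dfa[pat.charAt(i)][state];
        return state;
    }

    public static void main(String[] args) {
        String pat = "ABABAC";
        int[] restart = new int[pat.length()];
        int[][] dfa = buildDfa(pat, restart);
        System.out.println(Arrays.toString(restart)); // prints [0, 0, 0, 1, 2, 3]
        for (int j = 1; j < pat.length(); j++)
            System.out.println(j + ": " + (simulate(dfa, pat, j) == restart[j]));
    }
}
```

The point of the sketch is that the update line is simply one more step of that simulation: x already holds the state for pat[1..j-1], and dfa[pat.charAt(j)][x] feeds it the next character pat[j], so nothing has to be recomputed from state 0.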

Answer 2

First, you should know the meaning of x:

before we update it, x is the state (how many characters are successfully matched) that we restart from when we are in the current state (j characters matched)
after we update it, x is the state that we restart from when we are in the next state (j + 1 characters matched)

Then:

The update of x corresponds to a successful match of txt[i] against pat[j]. Note what that match requires: the current state supplies the x, and the character needed supplies the pat.charAt(j) in x = dfa[pat.charAt(j)][x]. When a match fails in some state, the DFA behaves as if it were in the restart state x for that state, because the next iteration of the loop in search() has to match txt[i + 1], not txt[i] again.
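The search() loop mentioned above can be sketched as follows, modeled on the book's KMP search and using the DFA construction from the question. The class name KmpSearchDemo and the static method layout are assumptions made to keep the example self-contained. The thing to notice is that i advances by exactly one on every iteration, which is why a mismatch has to land in a state that already accounts for the text consumed so far.

```java
final class KmpSearchDemo {
    static final int R = 256; // alphabet size, as in the question's code

    // DFA construction copied from the question.
    static int[][] buildDfa(String pat) {
        int m = pat.length();
        int[][] dfa = new int[R][m];
        dfa[pat.charAt(0)][0] = 1;
        for (int x = 0, j = 1; j < m; j++) {
            for (int c = 0; c < R; c++)
                dfa[c][j] = dfa[c][x];       // Copy mismatch cases.
            dfa[pat.charAt(j)][j] = j + 1;   // Set match case.
            x = dfa[pat.charAt(j)][x];       // Update restart state.
        }
        return dfa;
    }

    // Returns the index of the first occurrence of pat in txt,
    // or txt.length() if there is none.
    static int search(String pat, String txt) {
        int[][] dfa = buildDfa(pat);
        int n = txt.length(), m = pat.length();
        int i, j;
        // The text pointer i never backs up: on a mismatch, column j holds
        // the transitions copied from its restart state, so j falls back
        // while i still moves on to txt[i + 1].
        for (i = 0, j = 0; i < n && j < m; i++)
            j = dfa[txt.charAt(i)][j];
        if (j == m) return i - m;            // found
        return n;                            // not found
    }

    public static void main(String[] args) {
        System.out.println(search("ABABAC", "AABACAABABACAA")); // prints 6
    }
}
```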
