Rabin-Karp algorithm: how is the worst case O(m * n) for the given input?

Question

In the TopCoder code for the RK algorithm:

// correctly calculates a mod b even if a < 0
function int_mod(int a, int b)
{
  return (a % b + b) % b;
}

function Rabin_Karp(text[], pattern[])
{
  // let n be the size of the text, m the size of the
  // pattern, B - the base of the numeral system,
  // and M - a big enough prime number

  if(n < m) return; // no match is possible

  // calculate the hash value of the pattern
  hp = 0;
  for(i = 0; i < m; i++) 
    hp = int_mod(hp * B + pattern[i], M);

  // calculate the hash value of the first segment 
  // of the text of length m
  ht = 0;
  for(i = 0; i < m; i++) 
    ht = int_mod(ht * B + text[i], M);

  if(ht == hp) check character by character if the first
               segment of the text matches the pattern;

  // start the "rolling hash" - for every next character in
  // the text calculate the hash value of the new segment
  // of length m; E = B^(m-1) modulo M
  for(i = m; i < n; i++) {
    ht = int_mod(ht - int_mod(text[i - m] * E, M), M);
    ht = int_mod(ht * B, M);
    ht = int_mod(ht + text[i], M);

    if(ht == hp) check character by character if the
                 current segment of the text matches
                 the pattern; 
  }
}

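The article states:

Unfortunately, there are still cases where we have to run the entire inner loop of the "naive" method for every starting position in the text, for example when searching for the pattern "aaa" in the string "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", so in the worst case we still need (n * m) iterations.

But wouldn't the algorithm stop at the first iteration, as soon as it sees that the first three letters are 'a' and match the needle?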

Answer 1

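Suppose the string we are searching for is not "aaa" but some other string whose hash is the same as the hash of "aaa". Then a character-by-character comparison is needed at every position.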

Of course, we would expect that comparison to fail well before m characters have been checked, but it could take O(m) characters.
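To make the collision case concrete, here is a minimal sketch. The base `B = 256` and the deliberately tiny modulus `M = 101` are assumptions chosen so that a collision is easy to find by brute force; the helper names are hypothetical and not from the TopCoder article. The window hash is recomputed naively rather than rolled, since only the hit count matters here.

```python
from itertools import product

# Illustrative parameters (assumptions): a tiny prime modulus
# makes hash collisions easy to find.
B = 256
M = 101

def poly_hash(s):
    """Polynomial hash of s in base B, modulo M."""
    h = 0
    for ch in s:
        h = (h * B + ord(ch)) % M
    return h

def find_collision(target, alphabet="abcdefghijklmnopqrstuvwxyz"):
    """Brute-force a different string of the same length with the same hash."""
    want = poly_hash(target)
    for cand in product(alphabet, repeat=len(target)):
        s = "".join(cand)
        if s != target and poly_hash(s) == want:
            return s
    return None

def count_hash_hits(text, pattern):
    """Return (windows whose hash equals the pattern's hash, true matches)."""
    m = len(pattern)
    hp = poly_hash(pattern)
    hits = matches = 0
    for i in range(len(text) - m + 1):
        window = text[i:i + m]
        if poly_hash(window) == hp:  # hash hit: verification is required
            hits += 1
            if window == pattern:
                matches += 1
    return hits, matches

pattern = find_collision("aaa")
text = "a" * 30
hits, matches = count_hash_hits(text, pattern)
print(pattern, hits, matches)  # 28 hash hits, 0 true matches
```

Every one of the 28 windows triggers a verification even though the pattern never occurs. With a large prime modulus such collisions become rare, but they are not impossible.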

Having said that, a common use of RK is to find all (overlapping) instances, in which case the quoted example is clearly O(mn).
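The count behind that claim can be written out. Assuming every one of the n - m + 1 windows is a true match, each needs all m comparisons to confirm:

```latex
\[
\underbrace{(n - m + 1)}_{\text{windows}} \cdot
\underbrace{m}_{\text{comparisons per window}}
= nm - m^2 + m = O(mn)
\]
```

For m at most n/2 there are at least n/2 windows, so the product is at least mn/2 and the bound is tight.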

Answer 2

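The Rabin-Karp algorithm keeps computing the hash of every substring of text of length m and compares it with the hash of pattern. Several substrings can have the same hash value.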

So when the hash of pattern and the hash of some substring of text match, we have to iterate over both again just to make sure they are actually the same.

With pattern = "AAA" and text = "AAAAAAAAAAAAA", there are O(n) substrings whose hash matches that of pattern. Each of those matches takes O(m) time to confirm, hence the worst-case complexity of O(n*m).
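The effect can be measured with a small sketch of the rolling-hash search that counts the character comparisons made during verification. The parameter values (`B = 256`, `M = 1_000_000_007`), the `find_all` flag, and the 13-character text are assumptions made for illustration, not part of the original code.

```python
def rabin_karp(text, pattern, find_all=True, B=256, M=1_000_000_007):
    """Return (match positions, character comparisons made while verifying).

    B and M are illustrative choices; find_all=False stops at the first match.
    """
    n, m = len(text), len(pattern)
    if m == 0 or n < m:
        return [], 0

    E = pow(B, m - 1, M)  # B^(m-1) mod M, the weight of the leading character
    hp = ht = 0
    for i in range(m):
        hp = (hp * B + ord(pattern[i])) % M
        ht = (ht * B + ord(text[i])) % M

    positions, comparisons = [], 0
    for start in range(n - m + 1):
        if start > 0:
            # Roll the hash: drop text[start - 1], append text[start + m - 1].
            ht = (ht - ord(text[start - 1]) * E) % M
            ht = (ht * B + ord(text[start + m - 1])) % M
        if ht == hp:
            # Hash hit: verify character by character.
            k = 0
            while k < m:
                comparisons += 1
                if text[start + k] != pattern[k]:
                    break
                k += 1
            if k == m:
                positions.append(start)
                if not find_all:
                    break
    return positions, comparisons

text, pattern = "A" * 13, "AAA"
all_pos, all_cmp = rabin_karp(text, pattern, find_all=True)
first_pos, first_cmp = rabin_karp(text, pattern, find_all=False)
print(len(all_pos), all_cmp)  # 11 matches, 11 * 3 = 33 comparisons
print(first_pos, first_cmp)   # [0] after 3 comparisons
```

Stopping at the first match does finish after m comparisons on this input, which is what the question expected. The (n - m + 1) * m cost appears only when all occurrences are reported, or when hash hits turn out to be false.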

Answer 1
1 2016-08-20 15:03:59

Answer 2
1 2018-03-26 16:44:19
