
Find longest repeating substring with length between x and y

Given a string, "blablafblafbla", and two limits, x=3 and y=5, I want to find the longest repeating substring whose length is between x and y. If there are several, I want the first one. In my example that would be "blaf". Several questions:

1. Is it easier to use regex?
2. I know how to find the longest substring, but where do I have to put the conditions for it to be between x and y?

public static String longestDuplicate(String text)
{
    String longest = "";
    for (int i = 0; i < text.length() - 2 * longest.length() * 2; i++)
    {
        OUTER: for (int j = longest.length() + 1; j * 2 < text.length() - i; j++)
        {
            String find = text.substring(i, i + j);
            for (int k = i + j; k <= text.length() - j; k++)
            {
                if (text.substring(k, k + j).equals(find))
                {
                    longest = find;
                    continue OUTER;
                }
            }
            break;
        }
    }
    return longest;
}

The code you provided is an extremely inefficient way to solve your problem. I would implement the solution using Rabin-Karp or some other rolling hash algorithm, which lets you solve the problem with complexity O((y-x) * L), where L is the length of the string.
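The rolling-hash idea can be sketched as follows. This is only an illustration, not the answerer's code: the class name `RollingHashSearch`, the method names, and the constants `BASE` and `MOD` are assumptions made for the example. It tries each length from y down to x, slides a hash over every window of that length, and confirms each hash hit with a direct comparison, so the expected cost is roughly one pass over the string per candidate length. Occurrences are required not to overlap, matching the behaviour of the code in the question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RollingHashSearch {
    // Assumed constants for the polynomial hash; any reasonable base/modulus works.
    private static final long BASE = 257L;
    private static final long MOD = 1_000_000_007L;

    /** Leftmost substring of exactly len chars that occurs twice without overlap, or null. */
    static String firstRepeatOfLength(String text, int len) {
        int n = text.length();
        if (len <= 0 || 2 * len > n) return null;

        long power = 1; // BASE^(len-1) mod MOD, used to drop the leading character
        for (int i = 1; i < len; i++) power = power * BASE % MOD;
        long hash = 0;  // hash of the current window
        for (int i = 0; i < len; i++) hash = (hash * BASE + text.charAt(i)) % MOD;

        // hash -> earliest start index of each distinct window with that hash
        Map<Long, List<Integer>> firstStarts = new HashMap<>();
        int best = -1;
        for (int p = 0; ; p++) {
            List<Integer> starts = firstStarts.computeIfAbsent(hash, k -> new ArrayList<>());
            boolean seen = false;
            for (int s : starts) {
                // Confirm the hit character by character to rule out hash collisions.
                if (text.regionMatches(s, text, p, len)) {
                    seen = true;
                    if (s + len <= p && (best < 0 || s < best)) best = s;
                    break;
                }
            }
            if (!seen) starts.add(p);
            if (p + len >= n) break; // that was the last window
            // Roll the window one position to the right.
            hash = (hash - text.charAt(p) * power % MOD + MOD) % MOD;
            hash = (hash * BASE + text.charAt(p + len)) % MOD;
        }
        return best < 0 ? null : text.substring(best, best + len);
    }

    /** Longest repeated substring with length in [x, y]; the leftmost one on ties. */
    static String longestRepeat(String text, int x, int y) {
        for (int len = Math.min(y, text.length() / 2); len >= x; len--) {
            String hit = firstRepeatOfLength(text, len);
            if (hit != null) return hit;
        }
        return null;
    }
}
```

Calling `RollingHashSearch.longestRepeat("blablafblafbla", 3, 5)` returns "blaf" for the string in the question.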

You can't use regular expressions here; they are meant to solve completely different tasks.

As for your question on how to adapt your solution to find the longest substring with length between x and y, simply modify the loop over j to only consider values in the interval [x, y]. Here is how you can do that:

for (int j = Math.max(longest.length() + 1, x) ; j * 2 < text.length() - i && j <= y; j++)

EDIT: to find the longest substring, reverse the for loop:

for (int j = Math.min((text.length() - i -1)/2, y) ; j > longest.length() && j >=x; j--) 
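Once the loop over j runs downwards, the rest of the loop body has to change with it: a miss at one length should move on to the next shorter length instead of breaking out, and the first hit at a given start index is already the longest one for that index. A minimal sketch of the whole method under those assumptions follows. The class name `BoundedLongestDuplicate` is hypothetical, the inner scan over k is replaced by `String.indexOf`, and the upper bound uses `(text.length() - i) / 2` so that two occurrences that exactly fill the remaining text are also considered.

```java
class BoundedLongestDuplicate {
    /**
     * Longest substring with length in [x, y] that occurs at least twice
     * without overlapping; the leftmost one on ties, "" if there is none.
     */
    static String longestDuplicate(String text, int x, int y) {
        String longest = "";
        for (int i = 0; i < text.length(); i++) {
            // Two non-overlapping copies starting at i need 2 * j characters.
            int maxLen = Math.min((text.length() - i) / 2, y);
            // Only lengths that beat the current best and respect the lower limit.
            for (int j = maxLen; j > longest.length() && j >= x; j--) {
                String find = text.substring(i, i + j);
                // Look for a second occurrence that starts after the first one ends.
                if (text.indexOf(find, i + j) >= 0) {
                    longest = find;
                    break; // shorter lengths at this start index cannot improve the result
                }
            }
        }
        return longest;
    }
}
```

For the string in the question, `BoundedLongestDuplicate.longestDuplicate("blablafblafbla", 3, 5)` returns "blaf".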
public static int commonPrefix (String string, int x, int y)
{
    int l = string.length ();
    int n = 0;
    int oy = y;
    while (x < oy && y < l && string.charAt (x) == string.charAt (y))
    {
        n++; x++; y++;
    }
    return n;
}

public static String longestRepeatingSubstring (
    String string, int minLength, int maxLength)
{
    String found = null; 

    int l = string.length ();
    int fl = minLength - 1; // longest length found so far; starts below minLength so a match of exactly minLength counts
    for (int x = 0; x < l - fl * 2; x++)
        for (int y = x + 1; y < l - fl; y++)
        {
            int n = commonPrefix(string, x, y);

            if (n >= maxLength)
                return string.substring(x, x + maxLength);

            if (n > fl)
            {
                found = string.substring (x, x + n);
                fl = n;
            }
        }

    return found;
}

public static void main(String[] args) {
    System.out.println (longestRepeatingSubstring ("blablafblafblafblaf", 3, 5));
}

Here is a clunky implementation with regex:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static String longestRepeatingSubstring (String string, int min, int max)
{
  for (int i=max; i>=min; i--){
    for (int j=0; j<string.length()-i+1; j++){

      String substr = string.substring(j,j+i);
      // Quote the candidate so regex metacharacters in the text are matched literally.
      Pattern pattern = Pattern.compile(Pattern.quote(substr));
      Matcher matcher = pattern.matcher(string);

      int count = 0;
      while (matcher.find()) count++;

      if (count > 1) return substr;
    }
  }

  return null;
}

public static void main(String[] args) {
  System.out.println (longestRepeatingSubstring ("blablafblafbla", 3, 5));
}

//import java.util.Vector;

// Counts the non-overlapping occurrences of subString in string.
public static int getCount(String string, String subString){
    int count = 0;
    int fromIndex = string.indexOf(subString);
    while (fromIndex != -1){
        count++;
        // Continue searching after the end of the occurrence just found.
        fromIndex = string.indexOf(subString, fromIndex + subString.length());
    }
    return count;
}
public static String longestRepeatingSubstring (int min, int max, String string){
    // Candidates that occur at least twice, in order of first appearance per length.
    Vector<String> substrs = new Vector<>();
    for (int i=min; i<=max; i++){
        for (int j=0; j<string.length()-i+1; j++){
            String substr = string.substring(j, i+j);
            if (substrs.indexOf(substr) == -1 && getCount(string, substr) > 1){
                substrs.addElement(substr);
            }
        }
    }
    // Pick the longest candidate; the strict comparison keeps the first one on ties.
    int maxLength = 0;
    int index = -1;
    for (int i = 0; i < substrs.size(); i++){
        int length = substrs.elementAt(i).length();
        if (length > maxLength){
            maxLength = length;
            index = i;
        }
    }
    // null when no substring in the given length range repeats.
    return index == -1 ? null : substrs.elementAt(index);
}
public static void main(String [] arg){
    System.out.print(longestRepeatingSubstring(3, 5, "blablafblafbla"));
}

