Find whether one string is a substring of another

public class StringIsSubstring {

    public static void main(String[] args) {
        String s1 = new String("anurag");
        String s2 = new String("anu");

        char a[] = s1.toCharArray();
        char b[] = s2.toCharArray();
        int i = 0;
        int j = 0;

        while(i < a.length && j < b.length)
        {
            if(a[i] == b[j])
            {
                i++;
                j++;
            }
            else
            {
                i++;
                j = 0;
            }
            if(j == b.length)
            {
                System.out.println("we have found the substring");
            }
        }
    }
}

I have written the code above to find whether one String is a substring of another. I don't want to use any library function. Is there a more efficient way to do the same thing?

It is not possible to do any operations on a String without using a library function. Your code uses String.toCharArray, for example. And if you can use that, then you can also use String.indexOf and avoid reinventing the wheel.
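For reference, here is a minimal sketch of the library route. The class name IndexOfDemo and the helper isSubstring are made up for this illustration; only String.indexOf comes from the JDK.

```java
// Minimal sketch: substring test via the standard library.
// IndexOfDemo and isSubstring are illustrative names, not part of the question.
class IndexOfDemo {

    // Returns true when pattern occurs somewhere inside text.
    static boolean isSubstring(String text, String pattern) {
        // indexOf returns the 0-based position of the first match, or -1 if there is none.
        return text.indexOf(pattern) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(isSubstring("anurag", "anu")); // true
        System.out.println(isSubstring("anurag", "rug")); // false
        System.out.println("cocoa".indexOf("coa"));       // 2
    }
}
```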

People have suggested Boyer-Moore. This is a good choice if you are going to search a large body of text (in String instances or in some other representation). However, if you are only going to search a small chunk of text (as in your question), then the setup costs of Boyer-Moore mean that String.indexOf() will be faster. The same applies to other sophisticated algorithms.
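To make the "setup cost" concrete, here is a rough sketch of the Horspool simplification of Boyer-Moore (not the full algorithm, and not what String.indexOf does internally). The class name HorspoolSketch and the choice of a table covering every char value are assumptions made for this illustration. Note that the table is filled before a single text character is examined.

```java
import java.util.Arrays;

// Sketch of Boyer-Moore-Horspool, a simplified Boyer-Moore variant.
// HorspoolSketch is an illustrative name; the table layout is an assumption.
class HorspoolSketch {

    // Returns the 0-based index of the first occurrence of pattern in text, or -1.
    static int indexOf(String text, String pattern) {
        char[] t = text.toCharArray();
        char[] p = pattern.toCharArray();
        int n = t.length;
        int m = p.length;
        if (m == 0) return 0;
        if (m > n) return -1;

        // Setup: one shift entry per possible char value (65,536 entries),
        // paid in full even when the pattern is only a few characters long.
        int[] shift = new int[Character.MAX_VALUE + 1];
        Arrays.fill(shift, m);
        for (int k = 0; k < m - 1; k++) {
            shift[p[k]] = m - 1 - k;
        }

        // Search: compare right to left, then jump by the shift for the
        // text character aligned with the last pattern position.
        int pos = 0;
        while (pos <= n - m) {
            int j = m - 1;
            while (j >= 0 && t[pos + j] == p[j]) {
                j--;
            }
            if (j < 0) return pos;
            pos += shift[t[pos + m - 1]];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("thequickbrownfoxanujumpedoverthelazydogs", "anu")); // 16
        System.out.println(indexOf("cocoa", "coa"));  // 2
        System.out.println(indexOf("anurag", "rug")); // -1
    }
}
```

For a three-character pattern and a six-character text, initialising the table dominates the running time, which is the point made above.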


So, the only way this question makes sense is if this is a homework exercise which includes a constraint on what you are allowed to use to solve the problem. In that case, unless you are doing an algorithmics course, I doubt that they expect you to research and implement a sophisticated algorithm.

You can look at the Boyer-Moore algorithm http://en.wikipedia.org/wiki/Boyer–Moore_string_search_algorithm and at http://en.wikipedia.org/wiki/String_searching_algorithm . You can also look at the Java implementation of String.indexOf.

Boyer-Moore has already been suggested, but let me also point out that your algorithm is actually broken. For example, if you test whether "coa" is a substring of "cocoa" (which it is), the loop matches "co", then fails on the second "c" and resets j. The problem is that i has already moved past that "c", which is the one that starts the real match, so it has been "consumed" and you don't get a match.
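A small sketch of the failure and one possible fix. The class and method names (NaiveSubstring, originalLoop, contains) are made up for this illustration; originalLoop is the question's loop with the print replaced by a return value, and contains differs only in how i is reset after a mismatch.

```java
// Sketch comparing the question's loop with a backtracking fix.
// All names here are illustrative.
class NaiveSubstring {

    // The question's loop: on a mismatch only j is reset, i keeps moving forward.
    static boolean originalLoop(String text, String pattern) {
        char[] a = text.toCharArray();
        char[] b = pattern.toCharArray();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                i++;
                j++;
            } else {
                i++;
                j = 0;
            }
            if (j == b.length) {
                return true;
            }
        }
        return false;
    }

    // Fix: on a mismatch, move i back to one past the start of the failed attempt.
    static boolean contains(String text, String pattern) {
        char[] a = text.toCharArray();
        char[] b = pattern.toCharArray();
        int i = 0;
        int j = 0;
        while (i < a.length && j < b.length) {
            if (a[i] == b[j]) {
                i++;
                j++;
            } else {
                i = i - j + 1; // the attempt began at i - j, so retry from the next position
                j = 0;
            }
        }
        return j == b.length;
    }

    public static void main(String[] args) {
        System.out.println(originalLoop("cocoa", "coa")); // false
        System.out.println(contains("cocoa", "coa"));     // true
        System.out.println(contains("anurag", "anu"));    // true
    }
}
```

The fixed version is the plain brute-force search, so it can take time proportional to the text length times the pattern length in the worst case.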

The previous comments have offered good reasons for using a library function; however, perhaps you are tasked with applying an alternative algorithm. From the sound of your post, you are likely to be working with small s1s and s2s. For this purpose, the Knuth-Morris-Pratt algorithm yields good efficiency. You can implement it like so:

public class SOStringDemo {

    public static void main(String[] args) {

        SOStringIsSubstring pair = new SOStringIsSubstring();

        pair.text = "thequickbrownfoxanujumpedoverthelazydogs";
        pair.pattern = "anu";

        pair.KMPMatch();

        return;
    }
}

And the class file:

public class SOStringIsSubstring {

    public String text;
    public String pattern;
    private char[] textArray;
    private char[] patternArray;
    private int[] prefix;

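    // Prints the 0-based index of every occurrence of pattern in text.
    public void KMPMatch() {

        textArray = text.toCharArray();
        patternArray = pattern.toCharArray();
        int n = textArray.length;
        int m = patternArray.length;

        // An empty pattern has nothing to compare against.
        if(m == 0)
            return;

        ComputePrefixFunction();
        int q = 0; // number of pattern characters matched so far

        for(int i = 0; i < n; i++) {
            // On a mismatch, fall back to the longest border of the matched prefix.
            // prefix is 0-based, so the entry for q matched characters is prefix[q-1].
            while((q > 0) && (patternArray[q] != textArray[i]))
                q = prefix[q-1];
            if(patternArray[q] == textArray[i])
                ++q;
            if(q == m) {
                // The match ends at i, so it starts at i - m + 1.
                System.out.println("SubString is at index " + (i - m + 1));
                q = prefix[q-1];
            }
        }
    }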

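    // Fills prefix[q] with the length of the longest proper prefix of
    // patternArray[0..q] that is also a suffix of it.
    public void ComputePrefixFunction() {

        int m = patternArray.length;
        prefix = new int [m];
        int k = 0; // length of the border currently being extended
        for(int q = 1; q < m; q++) {
            while((k > 0) && (patternArray[k] != patternArray[q]))
                k = prefix[k-1];
            if(patternArray[k] == patternArray[q])
                ++k;
            prefix[q] = k;
        }
    }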
}
