简体   繁体   中英

Simple String matching in java

I am trying to implement simple String matching. The algorithm should return 1 if the source String contains the pattern String. I cannot understand why it is returning -1 on the following inputs

    String source = "aababba";
    String pattern = "abba";

Here is my implementation:

public static int findMatch(String source, String pattern)
{
    int j = 0,  pos = -1;
    boolean matched = false;

    if(source.length() < pattern.length())
        return -1;


    for(int i = 0; i < (source.length() - pattern.length()); i++)
    {
        if(source.charAt(i) == pattern.charAt(j))
            j++;
        else
            j = 0;
        if(j == pattern.length())
        {
            matched = true;
            break;
        }           
    }

    if(matched)
        return 1;       
    return -1;  
}   

EDIT: As many of you suggested, the culprit was the for loop. I should have made it as follows. The rest of the code is the same. Other solutions are also possible as shown in the answers.

for(int i = 0; i <= (source.length() - pattern.length()); i++)
    {
        if(source.charAt(i+j) == pattern.charAt(j))
        {

You are only checking three characters

source.length() - pattern.length()

So j will never be equal to pattern.length . You need to check the entire source .

for(int i = 0; i < source.length(); i++)
{
  if(source.charAt(i) == pattern.charAt(j))
  ...
}

Regarding the Solution

for(int i = 0; i <= (source.length() - pattern.length()); i++)
    {
        if(source.charAt(i+j) == pattern.charAt(j))
        {

Why have an additional subtraction and addition plus more complicated code when you can just step through the length of source?

The source of an issue is that you've limited the upper bound of i variable by (source.length() - pattern.length()) . The for loop you've written can't check all the characters of source string, thus it'll return -1 even for some pairs of strings where it's possible to find a match.
Solution: rewrite the for loop like so:

for(int i = 0; i < source.length(); i++)

源的长度为7。模式的长度为4。您的语句i < (source.length() - pattern.length()对于i> 2而言将不成立,因此循环不会足够“远”地运行。

您可以轻松地使用String.contains方法检查字符串是否包含模式。

I think you already know the reason why it return wrong. You can try this:

j = 0;
while(j <= (source.length() - pattern.length())){

    for (i = pattern.length(); i >= 0 && pattern[i] == source[i + j]; --i);

    if (i < 0)) {
        matched = true;
        break;
    }
    else j++;
}

about string matching, there are many methods, you can use KMP, BM and Sunday and so on, and performance comparation is: KMP < BM < Sunday, you can try these and they are very userful.

By using this statement (source.length() - pattern.length()) you are looping on the string only on the three first chars of the source. So I will never find the entire pattern.

Try using pattern.length() instead, or better try compute the minimal value of both pattern.length() and source.length()

I'll assume it should return 1 because "aababba" contains "abba" at the end.

This, from your edit (solution) will return 1 :

for(int i = 0; i <= (source.length() - pattern.length()); i++) {
    if(source.charAt(i+j) == pattern.charAt(j)) {

But from my assumption, you should be checking through every letter in source so I would use:

for(int i = 0; i < source.length(); i++) {
    if(source.charAt(i) == pattern.charAt(j)) {

Because why ignore the last bunch of characters?

But maybe I'm just assuming wrong what you want your algorithm to do because from what I've assumed, you could just use:

if (source.contains(pattern))
    return 1;
else
    return -1;

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM