简体   繁体   中英

Big-O analysis of a sub-string of string algorithm. Does O(n*logn) simplify?

Hello and thanks for reading. I am working on a homework assignment. I have created a method that compares to strings to see if one is a sub-string of another. I am aware there is already a built in method to do this. The assignment does not allow me to use them. Anyway, the below code is working.

I have to analyze the complexity of the algorithm using Big-O notation. From what I can see the outer loop runs in linear time because it simply runs as many times as the length of the string. Thus: O(n)

The inner loop is different, it may or may not happen, and if it does it may finish before the the length of the second string which is it's input. Thus: O(logn)

So it appears to me the the complexity is O (n*logn). Does this simplify to O(n) or does it stay in its current form? Or am I wrong about the inner loop being O(logn)?

import java.util.Scanner;

public class HW3q6 {
public static void main(String[] args) {
    Scanner userInput = new Scanner( System.in );
    System.out.println("Please enter the first string: ");
    char[] charArray1 = userInput.nextLine().toUpperCase().toCharArray();
    System.out.println("Please enter the second string: ");
    char[] charArray2 = userInput.nextLine().toUpperCase().toCharArray();

    System.out.println("The second string is a substring of the first: " + subString(charArray1, charArray2));

}

private static boolean subString(char[] charArray1, char[] charArray2) {
    int counter = 0;
    for (int i = 0; i < (charArray1.length + 1) - charArray2.length ; i++) {
        if (charArray1[i] == charArray2[0]) {
            for (int n = 0; n < charArray2.length; n++) {
                if (charArray1[i+n] == charArray2[n]) {
                    counter++;
                }
            }
            if (counter == charArray2.length) {
                return true;
            } else
                counter = 0;
        }
    }
    return false;

}
}

The inner loop is not logN. Big O measures the worst case complexity. As I understand from your code,the inner loop can run N (length of string 2) times. Explanation follows:

Suppose you have two strings say aaaaaaaa and aaac, the outer loop will match first char, you will go to the inner loop, check every character, then realize a false. Then outer loop will again match the beginning of the 2nd string with second character of first string, then you will again check all characters in second string realize a false. This will go for M characters of the first string, as well as N characters of the second string, yielding a O(MN) algo, which is O(N^2) considering M=c*N where c is a constant.

Hope this helps.

I see you already accepted an answer. But your time complexity is actually O(m*(n-m+1)) where m is the smaller text and n is the larger text. To see that this distinction matters just pick a few numbers for m and n and compute. If the argument is that when it comes to Big-O notation there is not much difference between O(mn) and O(m*(n-m+1)) since O(mn) > O(m*(n-m+1)) , then you could say the complexity is O(n^2) since O(n^2) > O(mn) > O(m*(n-m+1)) . Also your code is not doing proper error checkings, see Geekviewpoint.com on string matching for some hints.

O(NlogN) is not equivalent to O(N)

However, your reasoning that the inner loop is O(logN) is faulty. The fact that it "it may or may not happen" does not necessarily make it O(logN) . For instance, if the inner loop happened "roughly half the time on average" then the contribution is likely to be C * 1/2 N ; ie O(N) .

I don't have time to analyse the code in fine detail right right now, but it seems that you need to look at the best, average and worst case complexity.

(For instance, the classic form of the quicksort algorithm which is O(NlogN) on average but has a worst case complexity of O(N^2) .)

The inner loop still, as mellamokb pointed out, runs in linear time. Therefore your algorithm as a whole will be O(mn), where m and n are the lengths of the two strings.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM