
Why is this method's time complexity 2*O(n log n) + O(m log m)?

The code below checks whether all the words in searchWords appear in newsPaperWords. Both lists can contain duplicates: if a word appears n times in searchWords, it has to appear at least n times in newsPaperWords for the method to return true. I thought the time complexity was 2*O(n) + O(m), but the interviewer told me that it is 2*O(n log n) + O(m log m).

/**
 * @param searchWords The words we're looking for. Can contain duplicates
 * @param newsPaperWords  The list to look into
 */
public boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
    Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
    Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
    for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
        Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
        if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
            return false;
        }
    }
    return true;
}

private Map<String, Integer> getWordCountMap(List<String> words) {
    Map<String, Integer> result = new HashMap<>();
    for (String word : words) {
        Integer occurrencesThisWord = result.get(word);
        if (occurrencesThisWord == null) {
            result.put(word, 1);
        } else {
            result.put(word, occurrencesThisWord + 1);
        }
    }
    return result;
}

As I see it, the time complexity of the method is 2*O(n) + O(m), where n is the number of elements in searchWords and m the number of elements in newsPaperWords:

  • The method getWordCountMap() has a complexity of O(n), where n is the number of elements in the given list: it loops over the list once, assuming the calls to result.get(word) and result.put() are O(1).
  • The iteration over searchWordCount.entrySet() is, worst case, O(n), again assuming that calls to HashMap.get() are O(1).

So, simply adding up: O(n) + O(m) to build the two maps, plus O(n) for the final loop.
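As an aside, the counting pass in getWordCountMap() can be written more compactly with Map.merge; the complexity argument is unchanged, since it is still one amortized O(1) update per word. A minimal sketch (countWords is just a renamed variant, not part of the original code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WordCount {
    // Compact equivalent of getWordCountMap(): one amortized O(1)
    // merge() call per word, so the whole pass stays O(n).
    static Map<String, Integer> countWords(List<String> words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            // Insert 1 if absent, otherwise add 1 to the existing count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }
}
```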

After reading this answer, which takes O(n) as the worst-case complexity of HashMap.get(), I could understand the complexity of getWordCountMap() going up to O(n*2n) and the final loop to O(n*n), which would give a total complexity of O(n*2n) + O(m*2m) + O(n*n).

But how is it 2*O(n log n) + O(m log m)?

Due to JEP 180: Handle Frequent HashMap Collisions with Balanced Trees, the worst case for the HashMap.get() operation is O(log n). To quote JEP 180:

The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).

This would make getWordCountMap() method O(n log n) .
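To see the JEP 180 behavior concretely, here is a sketch using a hypothetical key type (CollidingKey is made up for illustration) whose hashCode() always returns the same value, so every entry lands in one bucket. On Java 8+, once that bucket passes the treeify threshold (8 entries, with a large enough table), it becomes a balanced tree, keeping lookups at O(log n) instead of O(n):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical key that always collides: every instance hashes to 42,
// so all entries share a single bucket. Implementing Comparable lets
// HashMap order the entries within the treeified bucket.
final class CollidingKey implements Comparable<CollidingKey> {
    final String value;
    CollidingKey(String value) { this.value = value; }
    @Override public int hashCode() { return 42; }   // 100% collisions
    @Override public boolean equals(Object o) {
        return o instanceof CollidingKey && ((CollidingKey) o).value.equals(value);
    }
    @Override public int compareTo(CollidingKey o) { return value.compareTo(o.value); }
}

public class Treeify {
    public static void main(String[] args) {
        Map<CollidingKey, Integer> map = new HashMap<>();
        for (int i = 0; i < 10_000; i++) {
            map.put(new CollidingKey("word" + i), i);
        }
        // Despite every key colliding, the bucket is a balanced tree,
        // so this lookup is O(log n) rather than a linear scan.
        System.out.println(map.get(new CollidingKey("word1234"))); // prints 1234
    }
}
```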

The complexity you have deduced is correct, assuming the HashMap uses a proper hash function. This algorithm looks like O(m + n) to me.


I guess that your interviewer described the complexity of another approach to solving this problem which is more time-consuming but ends up taking less space.
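One approach with exactly that sort-dominated cost profile is to sort both lists and compare them with a two-pointer scan: sorting costs O(n log n) + O(m log m), and the scan is only O(n + m). This is my guess at such a variant, not necessarily what the interviewer had in mind:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class SortedWordMatch {
    // Sort-based variant: no HashMap, just two sorted copies and a
    // merge-style scan that consumes one newspaper word per search word.
    public static boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
        List<String> search = new ArrayList<>(searchWords);
        List<String> paper = new ArrayList<>(newsPaperWords);
        Collections.sort(search);   // O(n log n)
        Collections.sort(paper);    // O(m log m)
        int p = 0;
        for (String word : search) {    // O(n + m) two-pointer scan
            // Skip newspaper words that sort before the current search word.
            while (p < paper.size() && paper.get(p).compareTo(word) < 0) {
                p++;
            }
            if (p == paper.size() || !paper.get(p).equals(word)) {
                return false;           // word missing, or too few occurrences
            }
            p++;                        // consume one matching occurrence
        }
        return true;
    }
}
```

Because duplicates sort next to each other, each occurrence of a search word consumes a distinct occurrence in the newspaper list, which preserves the "at least n times" requirement.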
