Optimizing counting occurrences of a list of words in a given string (Python)

Question

I am creating a function that counts the occurrences of searched_words in a passed string. The result is a dictionary with the matching words as keys and their occurrences as values.

I have already created a function that accomplishes this but it is very poorly optimized.

def get_words(string, searched_words):
    words = string.split()

    # O(nm) where n is length of words and m is length of searched_words
    found_words = [x for x in words if x in searched_words]

    # O(n^2) where n is length of found_words
    words_dict = {}
    for word in found_words:
        words_dict[word] = found_words.count(word)

    return words_dict


print(get_words('pizza pizza is very cool cool cool', ['cool', 'pizza']))
# Results in {'pizza': 2, 'cool': 3}

I have attempted to use Counter from Python's collections module but cannot seem to reproduce the desired output. It seems the set datatype may also solve my optimization problem, but I am unsure how to count word occurrences while using sets.

Answer 1

You're right in thinking that there is a good solution using Counter:

from collections import Counter

string = 'pizza pizza is very cool cool cool'
search_words = ['cool', 'pizza']
word_counts = Counter(string.split())

# If you want to get a dict only containing the counts of words in search_words:
search_word_counts = {wrd: word_counts[wrd] for wrd in search_words}
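Note that the dict comprehension above keeps an entry with a count of 0 for any search word that never appears in the string. If you want only the matching words as keys, as in the question's expected output, one possible sketch is to filter while counting. It reuses the question's get_words signature; the name wanted is just an illustrative choice, and converting searched_words to a set is an assumption about what the asker meant by using sets:

```python
from collections import Counter

def get_words(string, searched_words):
    # A set makes each membership test O(1) on average.
    wanted = set(searched_words)
    # Count only the wanted words in a single pass over the split string.
    return dict(Counter(word for word in string.split() if word in wanted))

result = get_words('pizza pizza is very cool cool cool', ['cool', 'pizza'])
print(result)  # {'pizza': 2, 'cool': 3}
```

This does one pass over the words plus one pass to build the set, so it avoids both the repeated list lookups and the repeated count calls in the original function.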

Answer 2

Alternatively, you can build a list of counts with a list comprehension and then turn it into a dictionary with zip:

def get_words(string, searched_words):
    wordlist = string.split()
    wordfreq = [wordlist.count(p) for p in searched_words]
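    return dict(zip(searched_words, wordfreq))

That's shorter, removes the extra for loop, and needs no extra imports, though it does require passing the result of zip to dict. Keep in mind that wordlist.count scans the whole list once per searched word, so this is still O(nm), where n is the number of words in the string and m is the number of searched words.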
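One behavioural difference worth checking: this approach keeps searched words that never occur, with a count of 0, whereas the question's original function omits them. A small sketch of how that could be handled, where count_with_zip and the extra search word 'salad' are hypothetical names used only for illustration:

```python
def count_with_zip(string, searched_words):
    wordlist = string.split()
    # list.count scans the whole list once for each searched word.
    wordfreq = [wordlist.count(p) for p in searched_words]
    return dict(zip(searched_words, wordfreq))

counts = count_with_zip('pizza pizza is very cool cool cool',
                        ['cool', 'pizza', 'salad'])
# Drop zero counts to keep only the words that actually matched.
present_only = {word: n for word, n in counts.items() if n > 0}
print(counts)
print(present_only)
```

Whether to keep the zero entries depends on how the result is consumed; filtering is only needed if the caller expects matching words alone.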


2 answers

solution1
1 ACCEPTED 2020-12-11 04:09:33

solution2
1 2020-12-11 04:14:54
