
Python: count occurrences of list1 elements in list2

In the following code, I want to count how many times each word in word_list occurs in test. The code below does the job, but it may not be efficient. Is there a better way to do it?

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = [0] * len(word_list)
for i in range(len(word_list)):
    for w in test:
        if w == word_list[i]:
            result[i] += 1

print(result)

Use collections.Counter to count all the words in test in one go, then look up the count in the Counter for each word in word_list.

>>> import collections
>>> word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
>>> test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
>>> counts = collections.Counter(test)
>>> [counts[w] for w in word_list]
[1, 0, 3, 0, 0]

Or using a dictionary comprehension:

>>> {w: counts[w] for w in word_list}
{'hello': 1, 'wonderful': 0, 'good': 3, 'flawless': 0, 'perfect': 0}

Creating the counter is O(n) and each lookup is O(1) on average, giving O(n+m) overall for n words in test and m words in word_list, compared with O(n*m) for the nested loops in the question.
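To make the cost breakdown concrete, here is a minimal sketch that wraps the approach in a function. The name count_occurrences is made up for illustration and does not appear in the original answer.

```python
from collections import Counter


def count_occurrences(word_list, test):
    """Return, for each word in word_list, how often it appears in test.

    count_occurrences is a hypothetical helper name used for illustration.
    """
    counts = Counter(test)  # one pass over test: O(n)
    # m lookups, each O(1) on average; a missing word yields 0
    return [counts[w] for w in word_list]


word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
print(count_occurrences(word_list, test))  # [1, 0, 3, 0, 0]
```

Looking up a word that is not in the Counter returns 0 and does not add the word to it, so no special handling is needed for words that never occur in test.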

You can do it in linear time using a dictionary.

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = []
word_map = {}
for w in test:
    if w in word_map:
        word_map[w] += 1
    else:
        word_map[w] = 1

for w in word_list:
    result.append(word_map.get(w, 0))

print(result)
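The same idea can be written a little more compactly with dict.get, which removes the if/else branch. This is only a sketch of an equivalent formulation, not a change to the approach above.

```python
word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

word_map = {}
for w in test:
    # get() returns 0 for a word seen for the first time
    word_map[w] = word_map.get(w, 0) + 1

# Words from word_list that never occur in test default to 0
result = [word_map.get(w, 0) for w in word_list]
print(result)  # [1, 0, 3, 0, 0]
```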

You can combine collections.Counter and operator.itemgetter:

from collections import Counter
from operator import itemgetter

cnts = Counter(test)
word_cnts = dict(zip(word_list, itemgetter(*word_list)(cnts)))

Which gives:

>>> word_cnts
{'hello': 1, 'wonderful': 0, 'good': 3, 'flawless': 0, 'perfect': 0}

Or, if you would rather have it as a list of pairs:

>>> list(zip(word_list, itemgetter(*word_list)(cnts)))
[('hello', 1), ('wonderful', 0), ('good', 3), ('flawless', 0), ('perfect', 0)]
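One caveat with the itemgetter approach: with exactly one key, itemgetter returns the bare value rather than a 1-tuple, and with no keys it raises TypeError, so the zip above only works when word_list has at least two entries. The sketch below illustrates this and shows a comprehension that works for any length; counts_for is a hypothetical helper name, not part of the original answer.

```python
from collections import Counter
from operator import itemgetter

test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]
cnts = Counter(test)

# With two or more keys, itemgetter returns a tuple of values.
pair = itemgetter("hello", "good")(cnts)  # (1, 3)

# With exactly one key, it returns the bare value, not a 1-tuple,
# so zip(word_list, ...) would fail because an int is not iterable.
single = itemgetter("good")(cnts)  # 3


def counts_for(words, counter):
    """Hypothetical helper: map each word to its count, for any number of words."""
    return {w: counter[w] for w in words}


print(counts_for(["good"], cnts))  # {'good': 3}
print(counts_for([], cnts))        # {}
```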

You could try using dictionaries:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = {}
for word in word_list:
    result[word] = 0
for w in test:
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if w in result:
        result[w] += 1
print(result)

But you would end up with a different structure (a dictionary rather than a list). If you do not want that, you could try this instead:

word_list = ["hello", "wonderful", "good", "flawless", "perfect"]
test = ["abc", "hello", "vbf", "good", "dfdfdf", "good", "good"]

result = {}
for w in test:
    # dict.has_key() was removed in Python 3; use the `in` operator instead
    if w in result:
        result[w] += 1
    else:
        result[w] = 1
count = [0] * len(word_list)
for i in range(len(word_list)):
    if word_list[i] in result:
        count[i] = result[word_list[i]]
print(count)

