I'm writing a Python function that consumes a list of strings and produces a list of the most frequently occurring items.
For example:
>>> trending(["banana", "trouble", "StarWars", "StarWars", "banana", "chicken", "BANANA"])
["banana", "StarWars"]
but
>>> trending(["banana", "trouble", "StarWars", "Starwars", "banana", "chicken"])
["banana"]
So far, I've written a function that produces only the first word that appears frequently instead of a list of words that appear frequently. Also, my list contains the index of that one frequent item.
def trending(slst):
    words = {}
    for word in slst:
        if word not in words:
            words[word] = 0
        words[word] += 1
    return words
How can I fix this function to produce a list of the most frequently occurring items (instead of the first of the most frequently occurring items) and how do I remove the index?
Without the use of Counter, you can make your own counter with a dict and extract the most frequent items:
def trending(slst):
    count = {}
    items = []
    for item in set(slst):
        count[item] = slst.count(item)
    for k, v in count.items():
        if v == max(count.values()):
            items.append(k)
    return items
Use a Counter:
In [1]: from collections import Counter
In [2]: l = ["banana", "trouble", "StarWars", "StarWars", "banana", "chicken", "BANANA"]
In [3]: Counter(l)
Out[3]: Counter({'StarWars': 2, 'banana': 2, 'BANANA': 1, 'trouble': 1, 'chicken': 1})
With Counter(l).most_common(n) you can get the n most common items.
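Note that most_common(n) returns exactly n entries, so on its own it does not return every item tied for first place. Below is a minimal sketch of one way to collect all the ties with a Counter. The name trending_counter is only illustrative, and returning an empty list for empty input is an assumption, since the question does not say what should happen in that case.

```python
from collections import Counter


def trending_counter(slst):
    """Return every item tied for the highest count, in first-seen order."""
    counts = Counter(slst)
    if not counts:
        # Assumption: an empty input yields an empty result.
        return []
    # most_common(1) is a one-element list of (item, count) pairs.
    top = counts.most_common(1)[0][1]
    seen = set()
    result = []
    # Walk the original list so the result keeps first-seen order.
    for item in slst:
        if counts[item] == top and item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

Counting is case-sensitive here, so "banana" and "BANANA" are treated as different items, which matches the examples in the question.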
Your trending() function is basically what Counter does as well. After counting the word occurrences, you can get the maximum number of occurrences using max(words.values()). This can be used for filtering your word list:
def trending(slst):
    ...
    max_occ = max(words.values())
    return [word for word, occ in words.items() if occ == max_occ]
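Combined with the counting loop from the question, the whole function might look like the sketch below. The early return for an empty list is an added assumption: max() raises ValueError on an empty sequence, and the question does not specify that case.

```python
def trending(slst):
    # Count occurrences (case-sensitive), using the loop from the question.
    words = {}
    for word in slst:
        if word not in words:
            words[word] = 0
        words[word] += 1
    if not words:
        # Assumption: an empty input yields an empty result.
        return []
    # Compute the maximum once, then keep every word that reaches it.
    max_occ = max(words.values())
    return [word for word, occ in words.items() if occ == max_occ]
```

On Python 3.7 and later, dicts preserve insertion order, so the result lists the tied words in the order they first appear in the input. On older versions the order of the result is not guaranteed.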
The following solution uses only lists. No dictionary, set or other Python collection is used:
def trending(words):
    lcounts = [(words.count(word), word) for word in words]
    lcounts.sort(reverse=True)
    ltrending = []
    for count, word in lcounts:
        if count == lcounts[0][0]:
            if word not in ltrending:
                ltrending.append(word)
        else:
            break
    return ltrending
ltests = [
    ["banana", "trouble", "StarWars", "StarWars", "banana", "chicken", "BANANA"],
    ["banana", "trouble", "StarWars", "Starwars", "banana", "chicken"]]
for test in ltests:
    print(trending(test))
It gives the following output:
['banana', 'StarWars']
['banana']