How do I find the most common word in a list?

Question

I have only just started coding, so I am not using dictionaries, sets, imports, or anything more advanced than for/while loops and if statements.

list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"] 
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"] 

def codedlist(number):
      max= 0
      for k in hello:
            if first.count(number) > max:
                    max= first.count(number)

Answer 1

You can use collections.Counter to find it in a one-liner:

from collections import Counter

list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"] 
Counter(list1).most_common()[-1]

Output:

('cry', 2)

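(most_common() returns a list of the counted elements sorted by count, from most to least common, so the last element [-1] is the one with the lowest count.)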
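To make that ordering concrete, here is a small sketch that prints the whole result of most_common() for list1 from the question. The variable name `ranked` is only for illustration.

```python
from collections import Counter

list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]

# With no argument, most_common() returns every (element, count) pair,
# sorted from the highest count down to the lowest.
ranked = Counter(list1).most_common()
print(ranked)      # [('me', 4), ('no', 3), ('cry', 2)]

# The first pair is the most common element, the last pair the least common.
print(ranked[0])   # ('me', 4)
print(ranked[-1])  # ('cry', 2)
```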

Or, slightly more involved, if several elements can share the lowest count:

from collections import Counter

list1 = [1,2,3,4,4,4,4,4]
counted = Counter(list1).most_common()
least_count = min(counted, key=lambda y: y[1])[1]
list(filter(lambda x: x[1] == least_count, counted))

Output:

[(1, 1), (2, 1), (3, 1)]

Answer 2

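You can use collections.Counter to count how often each string occurs, then use min to get the lowest frequency, and then use a list comprehension to collect the strings that have that frequency: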

from collections import Counter

def codedlist(number):
    c = Counter(number)
    m = min(c.values())
    return [s for s, i in c.items() if i == m]

print(codedlist(list1))
print(codedlist(list2))

Output:

['cry']
['cry', 'no', 'me']

Answer 3

from collections import Counter

def least_common(words):
    # Count every word, then keep the (word, count) pairs with the lowest count.
    d = dict(Counter(words))
    min_freq = min(d.values())
    return [(k, v) for k, v in d.items() if v == min_freq]

words = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]
print(least_common(words))

Answer 4

A simple algorithmic approach can do this:

def codedlist(my_list):
    least = 99999999 # A very high number
    word = ''
    for element in my_list:
        repeated = my_list.count(element)
        if repeated < least:
            least = repeated # This is just a counter
            word = element # This is the word
    return word

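It does not perform very well, though, because count() rescans the whole list for every element. There are better ways to do this, but I think this one is easy for a beginner to understand.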
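As one example of a faster approach, the sketch below counts every word in a single pass with a dictionary and then looks for the smallest count, so it does not call count() once per element. The function name `least_common_single_pass` and its variable names are made up for this illustration, and it does use a dictionary, which the question wanted to avoid.

```python
def least_common_single_pass(words):
    # First pass: build a word -> count mapping.
    counts = {}
    for w in words:
        if w in counts:
            counts[w] += 1
        else:
            counts[w] = 1

    # Second pass over the distinct words: remember the smallest count seen.
    least_word = None
    least_count = None
    for w in counts:
        if least_count is None or counts[w] < least_count:
            least_word = w
            least_count = counts[w]
    return least_word  # None if the list was empty


list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
print(least_common_single_pass(list1))  # cry
```

Like the version above, this returns only one word; when several words tie for the lowest count, it returns the one that appeared first in the list.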

Answer 5

If you want all the words sorted by count, lowest first:

import numpy as np

list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]

uniques_values = np.unique(list1)

final_list = []
for i in range(0,len(uniques_values)):
    final_list.append((uniques_values[i], list1.count(uniques_values[i])))

def takeSecond(elem):
    return elem[1]

final_list.sort(key=takeSecond)

print(final_list)

For list1:

[('cry', 2), ('no', 3), ('me', 4)]

For list2:

[('cry', 3), ('me', 3), ('no', 3)]

Be careful with this code: to switch to a different list, you have to edit it in two places.

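Some helpful notes:

numpy.unique gives you the distinct (non-repeated) values, in sorted order.
def takeSecond(elem), whose body is return elem[1], is a key function that lets you sort the list by element [1] (the second value of each tuple).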

Displaying the counts, or having all the items sorted by this criterion, can be useful.
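The same idea can be sketched without numpy and wrapped in a function, so that the list only has to be named in one place. The function name `sorted_counts` is hypothetical, and a lambda stands in for takeSecond.

```python
def sorted_counts(words):
    # Collect each distinct word once, in order of first appearance.
    uniques = []
    for w in words:
        if w not in uniques:
            uniques.append(w)

    # Pair every distinct word with its number of occurrences.
    pairs = [(w, words.count(w)) for w in uniques]

    # Sort by the second value of each tuple (the count), lowest first.
    return sorted(pairs, key=lambda pair: pair[1])


list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]

print(sorted_counts(list1))  # [('cry', 2), ('no', 3), ('me', 4)]
print(sorted_counts(list2))  # [('cry', 3), ('no', 3), ('me', 3)]
```

For list2 the tied words come out in order of first appearance rather than alphabetically, because this sketch does not sort the distinct values the way numpy.unique does.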

Hope this helps.

Answer 6

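Finding the minimum is usually much like finding the maximum. You count the occurrences of an element, and if that count is smaller than your counter (the number of occurrences of the least common element so far), you replace the counter.

This is a rough solution that uses a lot of memory and takes a lot of time to run. If you try to cut down the run time and memory usage, you will learn more about lists (and the operations on them). I hope this helps!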

list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]

def codedlist(l):
    min = False #This is our counter
    indices = [] #This records the positions of the counts
    for i in range(0,len(l)):
        count = 0
        for x in l: #You can possibly shorten the run time here
            if(x == l[i]):
                count += 1
        if not min: #Also can be read as: If this is the first element.
            min = count
            indices = [i]
        elif min > count: #If this element is the least common
            min = count #Replace the counter
            indices = [i] # This is your only index
        elif min == count: #If this element ties for least common (others already have the same count)
            indices.append(i) #Add it to our indices counter

    tempList = []
    #You can possibly shorten the run time below
    for ind in indices:
        tempList.append(l[ind])
    rList = []
    for x in tempList: #Remove duplicates in the list
        if x not in rList:
            rList.append(x)
    return rList

print(codedlist(list1))
print(codedlist(list2))

Output:

['cry']
['cry', 'no', 'me']
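One way to reduce the repeated counting, while still using only lists, loops and if statements as the question asks, is to keep two parallel lists: one with the distinct words and one with their counts. The sketch below is an illustration under that assumption; the names `least_common_words`, `uniques` and `counts` are not from the answer above.

```python
def least_common_words(words):
    uniques = []  # each distinct word, in order of first appearance
    counts = []   # counts[i] is how often uniques[i] has been seen

    for w in words:
        found = False
        for i in range(len(uniques)):
            if uniques[i] == w:
                counts[i] += 1  # seen before: bump its count
                found = True
                break
        if not found:
            uniques.append(w)   # first sighting: start counting at 1
            counts.append(1)

    result = []
    if len(counts) == 0:
        return result  # empty input, nothing to report

    # Find the smallest count.
    smallest = counts[0]
    for c in counts:
        if c < smallest:
            smallest = c

    # Collect every word that has that smallest count.
    for i in range(len(uniques)):
        if counts[i] == smallest:
            result.append(uniques[i])
    return result


list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]

print(least_common_words(list1))  # ['cry']
print(least_common_words(list2))  # ['cry', 'no', 'me']
```

Each word is compared only against the distinct words seen so far rather than against the whole list, and no separate duplicate-removal step is needed at the end.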

Answer 7

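def codedlist(words):
    # Map each distinct word to the number of times it occurs.
    # (The names avoid shadowing the built-ins list and dict.)
    counts = {}
    for item in words:
        counts[item] = words.count(item)
    # Note: this uses max, so it returns the MOST common word(s);
    # use min here instead to get the least common ones.
    most_common_number = max(counts.values())
    most_common = []
    for k, v in counts.items():
        if most_common_number == v:
            most_common.append(k)
    return most_common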
list1 = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"] 
list2 = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"] 

print(codedlist(list1))

Answer 8

This is probably the simplest and fastest way to get the least common item in a collection.

min(list1, key=list1.count)

In practice:

>>> data = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]
>>> min(data, key=data.count)
'cry'

I tested its speed against the collections.Counter approach, and it was faster. See this REPL.
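The result of such a comparison depends on the input, so it is worth measuring on your own data. Here is a rough sketch of how that could be done with the standard timeit module. The helper names `with_min_count` and `with_counter` are invented for this example, and no particular timing is implied. Keep in mind that min(data, key=data.count) calls count() once per element, so its cost grows roughly with the square of the list length, while Counter makes a single pass.

```python
import timeit
from collections import Counter

data = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]


def with_min_count(items):
    # One count() scan per element of the list.
    return min(items, key=items.count)


def with_counter(items):
    # One pass to count, then take the last (least common) entry.
    return Counter(items).most_common()[-1][0]


# Time both approaches on the short list from the question.
for func in (with_min_count, with_counter):
    seconds = timeit.timeit(lambda: func(data), number=10_000)
    print("short list:", func.__name__, round(seconds, 4))

# Repeat with a longer list to see how each approach scales.
big = data * 100
for func in (with_min_count, with_counter):
    seconds = timeit.timeit(lambda: func(big), number=20)
    print("long list: ", func.__name__, round(seconds, 4))
```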

PS: Use max to find the most common item.
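For illustration, here is the max variant applied to the same list; the variable name `most` is just a placeholder.

```python
data = ["cry", "me", "me", "no", "me", "no", "no", "cry", "me"]

# Same pattern as before, with max instead of min.
most = max(data, key=data.count)
print(most)              # me
print(data.count(most))  # 4
```

When several items tie for the highest count, max returns the first of them that it encounters in the list.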

Edit

To get several least common items, you can extend this approach with a comprehension.

>>> data = ["cry", "cry", "cry", "no", "no", "no", "me", "me", "me"]
>>> lc = data.count(min(data, key=data.count))
>>> {i for i in data if data.count(i) == lc}
{'no', 'me', 'cry'}

Answer 9

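Basically, you want to walk through the list and, at each element, ask yourself:

"Have I seen this element before?"

If the answer is yes, add 1 to that element's count; if the answer is no, add it to the dictionary of seen values. At the end, sort the dictionary by value and pick the first word, since it has the smallest count. Let's implement it: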

import operator

words = ['blah', 'blah', 'car']
seen_dictionary = {}
for w in words:
    if w in seen_dictionary:
        seen_dictionary[w] += 1  # seen before: increment its count
    else:
        seen_dictionary[w] = 1   # first time: start counting at 1

# sorted() returns a list of (word, count) tuples ordered by the second element of each tuple,
# so [0][0] is the word with the smallest count.
final_word = sorted(seen_dictionary.items(), key=operator.itemgetter(1))[0][0]
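Sorting the whole dictionary is more work than needed when only the smallest entry is wanted. As a variation on the loop above, this sketch counts with dict.get and picks the key with the smallest count using min. The names `seen` and `least` are illustrative only.

```python
words = ['blah', 'blah', 'car']

seen = {}
for w in words:
    # get() returns 0 for a word that is not in the dictionary yet.
    seen[w] = seen.get(w, 0) + 1

# min() over the keys, compared by their counts.
least = min(seen, key=seen.get)
print(least)  # car
```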

9 answers

Answer 1
2 2019-09-09 14:12:39

Answer 2
1 2019-09-09 14:11:43

Answer 3
1 2019-09-09 14:18:27

Answer 4
1 2019-09-09 14:28:14

Answer 5
1 2019-09-09 14:33:45

Answer 6
1 2019-09-09 14:35:46

Answer 7
1 2019-09-09 14:43:22

Answer 8
1 2019-09-09 14:43:45

Answer 9
1 2019-09-10 11:31:15
