Find the most frequent words in a dataset

Question

I wrote a function that takes a list as input and returns the most common item in the list.

##Write the function
def most_frequent(List): 
    dict = {} 
    count, itm = 0, '' 
    for item in reversed(List): 
        dict[item] = dict.get(item, 0) + 1
        if dict[item] >= count : 
            count, itm = dict[item], item 
    return(item) 
  
    return num 

# verify the code

list = [5,42,34,6,7,4,2,5]
print(most_frequent(list))

Then I downloaded two text files to get the most frequent words.

# Download the files restaurants.txt and restaurant-names.txt from Github
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurants.txt -o restaurants.txt



# create the list from the restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant names
print("The most frequent restaurant names is ",most_frequent(List))

print(most_common(List))

But when I try to find the most frequent words that appear in the restaurant names, I get the same result. Could you help check whether this is correct? Thanks.

# create the list from the restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant names
print("The most frequent restaurant names is ",most_frequent(List))

Answer 1

It should be return itm (the most common item) instead of return item (the loop variable, which after the loop holds the last element visited in your reversed list).
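A minimal sketch of the function with that one change applied. The logic is the one from the question; the names items and counts are substitutions made here so the built-ins list and dict are not shadowed, and the unreachable second return is dropped.

```python
def most_frequent(items):
    """Return the most common item in items."""
    counts = {}
    count, itm = 0, ''
    for item in reversed(items):
        counts[item] = counts.get(item, 0) + 1
        # Remember the item whenever it matches or beats the best count so far
        if counts[item] >= count:
            count, itm = counts[item], item
    return itm  # the tracked winner, not the loop variable


print(most_frequent([5, 42, 34, 6, 7, 4, 2, 5]))  # 5
print(most_frequent([42, 5, 34, 6, 5, 7, 4, 2]))  # 5
```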

Answer 2

It seems as though you might be using the wrong filename for the restaurant names file. Judging from your curl command:

!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt

The filename you should be using is restaurant-names.txt, so your code should be:

# create the list from restaurant-names.txt
List = open("restaurant-names.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant names is ",most_frequent(List))
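Note that readlines() yields whole lines, so the code above counts full restaurant names rather than individual words. If the goal is the most frequent word inside the names, each line has to be split first. A rough sketch, assuming one name per line and whitespace-separated words; most_frequent_word and the sample names are made up for illustration, and collections.Counter is swapped in for the hand-written counting loop:

```python
from collections import Counter


def most_frequent_word(lines):
    """Return the most common word across all lines, ignoring case."""
    words = Counter()
    for line in lines:
        # split() with no arguments also discards the trailing newline
        words.update(line.lower().split())
    # most_common(1) returns [(word, count)], or [] for empty input
    top = words.most_common(1)
    return top[0][0] if top else None


# Invented sample data, not taken from the downloaded file
sample = ["Joe's Pizza\n", "Pizza Palace\n", "Golden Dragon\n"]
print(most_frequent_word(sample))  # pizza
```

With the downloaded file, the same function could be called as most_frequent_word(open("restaurant-names.txt")), since a file object iterates over its lines.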

Answer 3

The function itself might be wrong. What happens if you try the same test data in a different order, for example list = [42,5,34,6,5,7,4,2] instead of list = [5,42,34,6,7,4,2,5]? Is the output still 5?
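Running that experiment against the function as posted in the question makes the problem visible. The sketch below reproduces the posted logic, bug included, under the made-up name most_frequent_as_posted (with dict renamed to counts); it always returns the first element of the input, so the original test list only appeared to work because it starts with 5.

```python
def most_frequent_as_posted(items):
    counts = {}
    count, itm = 0, ''
    for item in reversed(items):
        counts[item] = counts.get(item, 0) + 1
        if counts[item] >= count:
            count, itm = counts[item], item
    # Bug kept on purpose: this returns the loop variable, not itm
    return item


print(most_frequent_as_posted([5, 42, 34, 6, 7, 4, 2, 5]))  # 5, correct only by coincidence
print(most_frequent_as_posted([42, 5, 34, 6, 5, 7, 4, 2]))  # 42, although 5 is the most frequent
```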

Answer 1
1 2020-07-22 20:36:27

Answer 2
0 2020-07-22 19:29:33

Answer 3
0 2020-07-22 19:32:41
