Finding the most frequent word in a dataset

Question

I wrote a function that takes a list as input and returns the most common item in the list.

# Write the function
def most_frequent(List): 
    dict = {} 
    count, itm = 0, '' 
    for item in reversed(List): 
        dict[item] = dict.get(item, 0) + 1
        if dict[item] >= count : 
            count, itm = dict[item], item 
    return(item) 
  
    return num 

# Verify the code

list = [5,42,34,6,7,4,2,5]
print(most_frequent(list))

Then I downloaded two text files to find the most common word in them.

# Download the files restaurants.txt and restaurant-names.txt from Github
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt
!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurants.txt -o restaurants.txt



# create the list from restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant names is ",most_frequent(List))

print(most_common(List))

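But when I tried to find the most common word appearing in the restaurant names, I got the same result. Could you help check whether this is correct? Thanks.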

# create the list from restaurants.txt
List = open("restaurants.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant names is ",most_frequent(List))

Answer 1

It should be return itm (the most common item), not return item (the last element visited in your reversed list).
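A minimal sketch of the function with that one change applied. The variable names `items`, `counts`, `best_count`, and `best_item` are renamed here for illustration (to avoid shadowing the built-ins `list` and `dict`); they are not from the original post.

```python
def most_frequent(items):
    """Return the most common item in `items`, or None for an empty list."""
    counts = {}
    best_count, best_item = 0, None
    for item in reversed(items):
        counts[item] = counts.get(item, 0) + 1
        # Track the item whose running count is currently the highest.
        if counts[item] >= best_count:
            best_count, best_item = counts[item], item
    # Return the tracked best item, not the loop variable.
    return best_item


print(most_frequent([5, 42, 34, 6, 7, 4, 2, 5]))  # prints 5
```

The unreachable second `return num` from the question is dropped in this sketch, since `num` is never defined.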

Answer 2

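It looks like you may be using the wrong file name for the restaurant names file. From your curl command:

!curl https://raw.githubusercontent.com/ipeirotis/introduction-to-python/master/data/restaurant-names.txt -o restaurant-names.txt

The file name you should use is restaurant-names.txt, so your code should be: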

# create the list from restaurant-names.txt
List = open("restaurant-names.txt").readlines()

# get the most frequent restaurant name
print("The most frequent restaurant names is ",most_frequent(List))
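Note that `readlines()` yields whole lines, so the code above counts full restaurant names rather than individual words. If the goal is the most common word, one possible approach is to split each line first. The sketch below assumes one name per line and whitespace-separated words; the helper name `most_common_word` and the sample data are hypothetical, and `collections.Counter` is swapped in for the hand-written counting loop.

```python
from collections import Counter


def most_common_word(lines):
    """Return the most common whitespace-separated word across `lines`."""
    words = []
    for line in lines:
        # split() also discards the trailing newline kept by readlines().
        words.extend(line.lower().split())
    if not words:
        return None
    return Counter(words).most_common(1)[0][0]


# Made-up sample lines standing in for the contents of the file.
sample = ["Joe's Pizza\n", "Pizza Palace\n", "Blue Cafe\n"]
print(most_common_word(sample))  # prints pizza
```

With the downloaded file, the call would be `most_common_word(open("restaurant-names.txt"))`, since a file object iterates over its lines.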

Answer 3

The function itself may be at fault. If you try the same test data in a different order, for example list = [42,5,34,6,5,7,4,2] instead of list = [5,42,34,6,7,4,2,5], is the output still 5?
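That check can be sketched as follows. The function below reproduces the question's logic, including `return item`, under the hypothetical name `most_frequent_as_posted`; the unreachable second return is omitted.

```python
def most_frequent_as_posted(values):
    """The question's function as posted: it returns the loop variable."""
    counts = {}
    count, itm = 0, ''
    for item in reversed(values):
        counts[item] = counts.get(item, 0) + 1
        if counts[item] >= count:
            count, itm = counts[item], item
    # As posted: returns the last value visited, which is values[0].
    return item


print(most_frequent_as_posted([5, 42, 34, 6, 7, 4, 2, 5]))  # prints 5
print(most_frequent_as_posted([42, 5, 34, 6, 5, 7, 4, 2]))  # prints 42
```

The first list only appears to work because its first element happens to be the most frequent one; reordering the data exposes the problem.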

3 answers

Answer 1
1 2020-07-22 20:36:27

Answer 2
0 2020-07-22 19:29:33

Answer 3
0 2020-07-22 19:32:41
