
Count frequency of word in text file in Python

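I'm trying to figure out how to write a program that opens a file chosen by the user (by typing in the file name) and counts the frequency of each word the user enters.

I have most of it working, but when I enter more than one word for the program to look for, only the first word shows the correct frequency; the rest show "0 occurrences".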

file_name = input("What file would you like to open? ")
f = open(file_name, "r")
the_full_text = f.read()
words = the_full_text.split()
search_word = input("What words do you want to find? ").split(",")
len_list = len(search_word) 

word_number = 0
print()
print ('... analyzing ... hold on ...')
print()
print ('Frequency of word usage within', file_name+":")
for i in range(len_list):

    frequency = 0
    for word in words:
        word = word.strip(",.")
        if search_word[word_number].lower() == word.lower():
            frequency += 1
    print ("   ",format(search_word[word_number].strip(),'<20s'),"/", frequency, "occurrences")
    word_number = word_number + 1

An example output would be:

What file would you like to open? assignment_8.txt
What words do you want to find? wey, rights, dem

... analyzing ... hold on ...

Frequency of word usage within assignment_8.txt:
    wey                  / 96 occurrences
    rights               / 0 occurrences
    dem                  / 0 occurrences

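What's wrong with my program? Please help :o

You need to strip the whitespace from the search words. Splitting "wey, rights, dem" on "," leaves a leading space on every word after the first, so " rights" never equals "rights".

However, your current algorithm is also inefficient, because it has to rescan the entire text for every search word. Here is a more efficient approach. First, we clean up the search words and put them in a list. Then we build a dictionary from that list to store the count for each word as we find it in the text file.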
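As a minimal illustration of where the zeros come from, the snippet below splits the same input used in the example session. The string `raw` is a stand-in for what `input()` would return, not part of the original program.

```python
# Stand-in for the text typed at the prompt in the example session.
raw = "wey, rights, dem"

# split(",") cuts only at the commas, so the space after each comma
# stays attached to the start of the next word.
pieces = raw.split(",")
print(pieces)    # ['wey', ' rights', ' dem']

# strip() removes the surrounding whitespace; lower() makes the
# later comparison case-insensitive.
cleaned = [piece.strip().lower() for piece in pieces]
print(cleaned)   # ['wey', 'rights', 'dem']
```

Because " rights" (with the leading space) is never equal to any word produced by splitting the file on whitespace, its count stays at 0.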

file_name = input("What file would you like to open? ")
with open(file_name, "r") as f:
    words = f.read().split()

# Split on commas, then strip whitespace so " rights" becomes "rights"
search_words = input("What words do you want to find? ").split(',')
search_words = [word.strip().lower() for word in search_words]
# One counter per search word, all starting at zero
search_counts = dict.fromkeys(search_words, 0)

print('\n... analyzing ... hold on ...')
# Single pass over the text, whatever the number of search words
for word in words:
    word = word.rstrip(",.").lower()
    if word in search_counts:
        search_counts[word] += 1

print('\nFrequency of word usage within', file_name + ":")
for word in search_words:
    print("   {:<20s} / {} occurrences".format(word, search_counts[word]))
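
If you would rather not manage the dictionary by hand, one possible variation is `collections.Counter` from the standard library. This is only a sketch, not what the code above does: it counts every word in the text rather than just the search words, which costs more memory on large files. The function name `count_search_words` and the sample text are made up for illustration.

```python
from collections import Counter

def count_search_words(text, search_words):
    """Return {search word: occurrences in text}, ignoring case."""
    # Same clean-up as above: drop trailing commas/periods, lowercase.
    counts = Counter(word.rstrip(",.").lower() for word in text.split())
    # A Counter returns 0 for keys it has never seen, so search words
    # that do not occur in the text need no special handling.
    cleaned = [word.strip().lower() for word in search_words]
    return {word: counts[word] for word in cleaned}

# Made-up sample text standing in for the contents of the file.
sample = "Wey dem rights, wey dem get. Wey."
result = count_search_words(sample, "wey, rights, dem".split(","))
print(result)    # {'wey': 3, 'rights': 1, 'dem': 2}
```

The returned dictionary keeps the order in which the search words were entered, so it can be printed with the same formatting loop as before.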

