Calculating source frequency in Python

Question

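I'm new to Python and I'm trying to compute source frequency. I have files (the sources are marked with tags), and for every word I want to count how many sources it appears in. For example, if the word "beautiful" appears in 5 sources, the result for "beautiful" should be 5 sources. I already have Python code that looks for a single word, but I need to do this for all the words in the files. Any ideas on how I should change the code?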

from os import listdir

with open("C:/Users/elle/Desktop/Archivess/test/rez.txt", "w") as f:
    for filename in listdir("C:/Users/elle/Desktop/Archivess/test/sources/books/"):
        with open('C:/Users/elle/Desktop/Archivess/test/freqs/books/' + filename) as currentFile:
            text = currentFile.read()

            if ('beautiful' in text):
                f.write('The word excist in the file ' + filename[:-4] + '\n')
            else:
                f.write('The word doen't excist in the file' + filename[:-4] + '\n')

I'd appreciate any help, thank you!

Answer 1

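As already mentioned, you need to escape the ' character. To escape it, put a \ in front of it, as in doen\'t. The code below then counts how many times each word occurs in each file: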

from os import listdir

with open("C:/Users/elle/Desktop/Archivess/test/rez.txt", "w") as f:
    for filename in listdir("C:/Users/elle/Desktop/Archivess/test/sources/books/"):
        with open('C:/Users/elle/Desktop/Archivess/test/freqs/books/' + filename) as currentFile:
            text = currentFile.read()
            text = text.strip().lower()
            # remove basic punctuation: . , " '
            text = text.replace(".", "").replace(",", "").replace("\"", "").replace("'", "")
            words = text.split()  # split on any whitespace (spaces, tabs, newlines)
            count_dict = {}  # word -> number of occurrences in this file
            for each_word in words:
                if each_word in count_dict:
                    count_dict[each_word] += 1
                else:
                    count_dict[each_word] = 1
            for k in count_dict:
                f.write('The word ' + k + ' exists in the file ' + filename[:-4] + ' ' + str(count_dict[k]) + ' times\n')

#             if ('beautiful' in text):
#                 f.write('The word excist in the file ' + filename[:-4] + '\n')
#             else:
#                 f.write('The word doen\'t excist in the file' + filename[:-4] + '\n')
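
Note that the loop above counts occurrences per file. If the goal is instead the number of sources each word appears in (as in the "beautiful" in 5 sources example from the question), a minimal sketch might look like the following. It assumes each file in the folder is one plain-text source; the helper name `source_frequency` and the `\w+` tokenization are illustrative choices, not part of the original code.

```python
import re
from collections import Counter
from os import listdir
from os.path import isfile, join


def source_frequency(folder):
    """Map each word to the number of files in `folder` that contain it.

    Hypothetical helper: assumes every regular file in `folder` is one source.
    """
    doc_freq = Counter()
    for filename in listdir(folder):
        path = join(folder, filename)
        if not isfile(path):
            continue  # skip sub-folders
        with open(path) as current_file:
            text = current_file.read().lower()
        # Drop apostrophes (as the answer above does), then pull out word-like runs.
        words = re.findall(r"\w+", text.replace("'", ""))
        # A set keeps each word once per file, so repeats within a file
        # do not inflate the per-source count.
        doc_freq.update(set(words))
    return doc_freq
```

Calling `source_frequency` on the books folder from the question returns a `Counter`, and iterating over its `.items()` gives (word, number of sources) pairs that can be written to rez.txt in the same way as above.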

Solution 1
0 2021-01-25 15:20:05
