
Calculating source frequency in Python

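I am new to Python and I am trying to calculate source frequency. I have a set of files (the sources are tokenized), and for every word I want to count how many sources it appears in. For example, for the word "beautiful" the result would be: the word "beautiful" is in 5 sources. I already have Python code that looks for a single word, but I need to do this for all the words in the files. How should I change the code? Any ideas?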

from os import listdir

with open("C:/Users/elle/Desktop/Archivess/test/rez.txt", "w") as f:
    for filename in listdir("C:/Users/elle/Desktop/Archivess/test/sources/books/"):
        with open('C:/Users/elle/Desktop/Archivess/test/freqs/books/' + filename) as currentFile:
            text = currentFile.read()

            if ('beautiful' in text):
                f.write('The word excist in the file ' + filename[:-4] + '\n')
            else:
                f.write('The word doen't excist in the file' + filename[:-4] + '\n')

I would appreciate any help, thanks!

As already mentioned, you need to escape the ' character. To escape it, put a backslash (\) in front of it, as in doen\'t.
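As a quick check of the escaping rule, here is a minimal sketch showing two equivalent ways to write a string literal that contains an apostrophe. The variable names are made up for illustration and the message text is only an example.

```python
# Option 1: keep the single-quoted string and escape the apostrophe.
escaped = 'The word doesn\'t exist in the file '

# Option 2: use double quotes so the apostrophe needs no escaping.
double_quoted = "The word doesn't exist in the file "

# Both literals produce exactly the same string.
print(escaped == double_quoted)
```

Either form avoids the syntax error. Double quotes are often easier to read when the text itself contains apostrophes.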

from os import listdir

with open("C:/Users/elle/Desktop/Archivess/test/rez.txt", "w") as f:
    for filename in listdir("C:/Users/elle/Desktop/Archivess/test/sources/books/"):
        with open('C:/Users/elle/Desktop/Archivess/test/freqs/books/' + filename) as currentFile:
            text = currentFile.read()
            text = text.strip().lower()
            # remove . , " and ' so that "word," and "word" are counted as the same word
            text = text.replace(".", "").replace(",", "").replace("\"", "").replace("'", "")
            words = text.split()  # split on any whitespace, including newlines
            # count how many times each word occurs in this file
            count_dict = {}
            for each_word in words:
                if each_word in count_dict:
                    count_dict[each_word] += 1
                else:
                    count_dict[each_word] = 1
            for k in count_dict:
                f.write('The word ' + k + ' exists in the file ' + filename[:-4] + ' ' + str(count_dict[k]) + ' times\n')

#             if ('beautiful' in text):
#                 f.write('The word excist in the file ' + filename[:-4] + '\n')
#             else:
#                 f.write('The word doen\'t excist in the file' + filename[:-4] + '\n')
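The code above counts how often each word occurs inside each file. If the goal is instead the number of sources that contain each word, as the question describes, one possible approach is sketched below. The helper names `tokenize` and `document_frequency` and the sample texts are hypothetical. The tokenization (lowercasing, then stripping punctuation from the ends of each word) is an assumption and differs slightly from the `replace` chain above.

```python
from collections import Counter
import string


def tokenize(text):
    """Lowercase the text, split on whitespace, strip punctuation from word ends."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]


def document_frequency(texts):
    """Return a Counter mapping each word to the number of texts containing it."""
    df = Counter()
    for text in texts:
        # set() ensures a word is counted at most once per source
        df.update(set(tokenize(text)))
    return df


# Hypothetical sample data standing in for the contents of the source files.
sample_sources = [
    "A beautiful day.",
    "Beautiful, beautiful flowers!",
    "Nothing to see here.",
]

df = document_frequency(sample_sources)
print(df["beautiful"])  # 2: the word appears in two of the three sources
```

To apply this to real files, the text of each file returned by `listdir` could be read into a list and passed to `document_frequency`, and each word could then be written to the result file with its count.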


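Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you republish this post, please cite this site's URL or the address of the original post. For any questions, contact: yoyou2525@163.com.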

 