Python: How do I search for and count occurrences of a root word in the text of a file?

Question

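Suppose we want to count how often the word "hope" occurs in a file. But our lines contain other words, such as "hopeful", "hopefully" or "hopeless".

I was able to write a small piece of code that opens a file, searches for a specific word such as "hopeless", and counts its occurrences.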

def read_file():
    # Open feedbacks.txt for reading and return the file object.
    file_name = "feedbacks.txt"
    try:
        lines = open(file_name, "r")
    except IOError:
        # Report the problem, then re-raise so the caller never gets a bogus value.
        print("file can't be opened")
        raise
    return lines

def freq(Lines, target):
    # Count the whitespace-separated words that are exactly equal to target.
    words = Lines.split()
    words_list = []
    for word in words:
        if word == target:
            words_list.append(word)
    print(len(words_list))

Lines = read_file().read()

freq(Lines, "hopelessly") # output is 3
freq(Lines, "hopeless") # output is 4
freq(Lines, "hopeful") # output is 2

But how can I search for all words that contain a root, for example "hope"?

PS: I am new to Python.

Answer 1

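If you know the root you are looking for, you can test with in instead of equality:

def freq(Lines, root):
    # Count the words that contain root as a substring.
    words = Lines.split()
    words_list = []
    for word in words:
        if root in word:  # this is changed: substring test instead of ==
            words_list.append(word)
    print(len(words_list))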
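
The order of the operands matters with in: the root goes on the left and the word being tested on the right. Here is a minimal sketch of the same idea that returns the count instead of printing it; the function name count_containing and the sample sentence are made up for illustration.

```python
def count_containing(text, root):
    """Count whitespace-separated words that contain root as a substring."""
    return sum(1 for word in text.split() if root in word)


# Hypothetical sample text, not taken from feedbacks.txt.
sample = "hopeful hopeless hopelessly hope unrelated"

print(count_containing(sample, "hope"))  # 4
print("hope" in "hopeful")               # True: the root is inside the word
print("hopeful" in "hope")               # False: operands the wrong way round
```

Returning the number rather than printing it makes the function easier to reuse and to test.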

Answer 2

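def freq(Lines, root):
    # Count the words that contain root anywhere inside them.
    words = Lines.split()
    words_list = []
    for word in words:
        if root in word:
            words_list.append(word)
    print(len(words_list))

Then call it:

freq(Lines, "hope")

If you want to check whether a word starts with the string, you can use this condition instead:

if word.startswith(root):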
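
A prefix test and a substring test can give different counts, and words split on whitespace still carry punctuation and capital letters. The sketch below assumes you want to ignore both; count_prefix and the sample sentence are hypothetical, not part of the original code.

```python
import string


def count_prefix(text, root):
    """Count words that start with root, ignoring case and edge punctuation."""
    root = root.lower()
    count = 0
    for word in text.split():
        # Drop leading/trailing punctuation such as commas or full stops.
        cleaned = word.strip(string.punctuation).lower()
        if cleaned.startswith(root):
            count += 1
    return count


# Hypothetical sample text, not taken from feedbacks.txt.
sample = "Hopeful people are never hopeless, but the unhoped-for happens."

print(count_prefix(sample, "hope"))  # 2 ("Hopeful" and "hopeless,")
```

A plain substring test on the same sentence would also count "unhoped-for", so pick the test that matches what you mean by a root.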

Answer 3

import re

text = """hopelessly") # output is 3
freq(Lines, "hopeless") # output is 4
freq(Lines, "hopeful"""

matches = re.findall(r"hope[a-z]*",text)
print(matches)

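This code prints ['hopelessly', 'hopeless', 'hopeful'].

len(matches) gives the count.

Replace text with your own payload. For simple uses, or for a root that does not change, this should do the trick.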
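
As a possible refinement of the regex approach, the sketch below anchors the root at a word boundary, ignores case, and tallies each variant separately. The helper name count_root_variants and the sample sentence are assumptions for illustration; re.escape is there only so that a root containing special characters is matched literally.

```python
import re
from collections import Counter


def count_root_variants(text, root):
    """Tally each word that begins with root (case-insensitive)."""
    # \b makes the match start at a word boundary, so "unhoped" is skipped.
    pattern = re.compile(r"\b" + re.escape(root) + r"[a-z]*", re.IGNORECASE)
    return Counter(match.lower() for match in pattern.findall(text))


# Hypothetical sample text, not taken from feedbacks.txt.
sample = "Hopeful or hopeless? I hope so, hopelessly. Unhoped-for."

counts = count_root_variants(sample, "hope")
print(sum(counts.values()))  # 4
print(counts["hopeless"])    # 1
```

Summing the counter gives the total for the root, while the individual entries show how often each form occurs.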

3 answers

Answer 1 1 2020-01-28 13:52:13

Answer 2 1 accepted 2020-01-28 13:55:31

Answer 3 1 2020-01-28 14:05:44
