Python: how to search and count the occurrences of a root word in a string in a file?
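Suppose we want to count how often the word "hope" occurs in a file. But our lines contain other words, such as "hopeful", "hopefully", or "hopeless".
I was able to write a small piece of code that opens a file, searches for a specific word such as "hopelessly", and counts its occurrences.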
def read_file():
    Lines = "empty.txt"
    fileName = "feedbacks.txt"
    if fileName is not None:
        mode = "r"
        try:
            Lines = open(fileName, mode)
        except IOError:
            print("file can't be opened")
    return Lines

def freq(Lines, str):
    # Count the words that are exactly equal to str
    words = Lines.split()
    words_list = []
    for i in words:
        if i == str:
            words_list.append(i)
    print(len(words_list))

Lines = read_file().read()
freq(Lines, "hopelessly")  # output is 3
freq(Lines, "hopeless")  # output is 4
freq(Lines, "hopeful")  # output is 2
But how do I search for all words that contain a root word, e.g. "hope"?
PS: I am new to Python.
If you know the root word you are looking for, you can check with in instead of equality:
def freq(Lines, str):
    words = Lines.split()
    words_list = []
    for i in words:
        if str in i:  # this is changed: substring test instead of ==
            words_list.append(i)
    print(len(words_list))
Then call:
freq(Lines, "hope")
If you want to check whether a word starts with the string instead, you can use:
if i.startswith(str):
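As a rough sketch of the startswith variant: the function below returns the count instead of printing it, and it lowercases each word and strips surrounding punctuation before testing. Both of those steps are assumptions that go beyond the code above, and the name count_prefix and the sample sentence are made up for illustration.

```python
import string

def count_prefix(text, root):
    """Count whitespace-separated words that start with root (case-insensitive)."""
    root = root.lower()
    count = 0
    for word in text.split():
        # Assumption: strip surrounding punctuation so 'hopeful,' is treated as 'hopeful'
        cleaned = word.strip(string.punctuation).lower()
        if cleaned.startswith(root):
            count += 1
    return count

# Hypothetical sample text, not taken from feedbacks.txt
sample = "Hope is not hopeless. She was hopeful, he was hopelessly lost; unhopeful too."
print(count_prefix(sample, "hope"))  # prints 4
```

Note that "unhopeful" is not counted here, whereas the `str in i` test would count it, since that test matches the root anywhere inside the word.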
import re
text = """hopelessly") # output is 3
freq(Lines, "hopeless") # output is 4
freq(Lines, "hopeful"""
matches = re.findall(r"hope[a-z]*",text)
print(matches)
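This code produces the output ['hopelessly', 'hopeless', 'hopeful']
len(matches) -> returns the count
Replace text with your own payload; for simple usage or a root word that does not change, this should do the trick.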
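One possible refinement of the regex approach, shown only as a sketch: anchor the pattern with \b so the root must sit at the start of a word, ignore case, and tally each variant with collections.Counter. The function name count_root_words and the sample text are hypothetical, and whether words like "unhopeful" should be excluded depends on what you want to count.

```python
import re
from collections import Counter

def count_root_words(text, root):
    """Return a Counter of words that begin with root, ignoring case."""
    # \b requires a word boundary before the root; re.escape guards
    # against regex metacharacters in the root string
    pattern = re.compile(r"\b" + re.escape(root) + r"[a-z]*", re.IGNORECASE)
    return Counter(match.lower() for match in pattern.findall(text))

# Hypothetical sample text
sample = "Hope fades. Hopeless, hopeless days; yet hopeful and unhopeful."
counts = count_root_words(sample, "hope")
print(counts)                # per-variant counts
print(sum(counts.values()))  # total number of matching words: 4
```

To run this against a file, one option is to read it with `with open("feedbacks.txt") as f: text = f.read()` and pass text to the function.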