How to count occurrences of a word in a string so that it still works with periods and endings

Question

So I've recently been working on this function:

# counts owls
def owl_count(text):
    # sets all text to lowercase
    text = text.lower()
    
    # sets text to list
    text = text.split()
    
    # saves indices of owl in list
    indices = [i for i, x in enumerate(text) if x == "owl"]
    
    # counts occurrences of owl in text
    owl_count = len(indices)
    
    # returns owl count and indices
    return owl_count, indices

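My goal is to count how many times "owl" appears in the string and save its indices. The problem I keep running into is that it won't count "owls" or "owl." (with a trailing period). I tried splitting the text into a list of single characters, but I couldn't find a way to search for three consecutive elements in a list. Do you have any ideas on what I could do here?

PS. I'm definitely a beginner programmer, so this is probably a simple solution.

Thanks!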

Answer 1

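If you don't want to use a large library like NLTK, you can filter for words that start with 'owl' rather than words that are exactly equal to 'owl':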

indices = [i for i, x in enumerate(text) if x.startswith("owl")]

In this case, words like 'owlowlowl' will also pass, but NLTK should be used to properly tokenize real-world words.
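As a minimal sketch, here is the function from the question with only the comparison swapped for startswith(). The sample sentence is made up for illustration, and the indices are positions in the list of whitespace-separated words.

```python
def owl_count(text):
    # lowercase the text and split it into whitespace-separated words
    words = text.lower().split()

    # keep the index of every word that begins with "owl",
    # so "owl", "owls." and "owl!" all match
    indices = [i for i, word in enumerate(words) if word.startswith("owl")]

    # return the number of matches and their word indices
    return len(indices), indices


print(owl_count("The owl saw two owls. Owl!"))  # (3, [1, 4, 5])
```

Note that this also counts any other word that merely begins with "owl", which is the trade-off mentioned above.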

Answer 2

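Python has this built in, through the re module. This kind of string matching falls under something called regular expressions, which you can look into in more detail later.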

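import re

a_string = "your string"
substring = "substring that you want to check"

# finditer() treats substring as a regular expression pattern
matches = re.finditer(substring, a_string)

# collect the starting character index of each match
matches_positions = [match.start() for match in matches]

print(matches_positions)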

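finditer() returns an iterator object, and start() returns the starting index of each match found.

Simply put, it returns the indices of all occurrences of the substring in the string.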
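Applied to the question, a rough sketch could look like the following. The function name owl_positions and the sample sentence are hypothetical, and the positions returned are character offsets into the string, not word indices as in the question's code.

```python
import re


def owl_positions(text):
    # lowercase first so "Owl" and "owl" are treated the same
    lowered = text.lower()

    # start() gives the character offset where each "owl" begins
    positions = [match.start() for match in re.finditer("owl", lowered)]

    # return the number of matches and their character offsets
    return len(positions), positions


print(owl_positions("The owl saw two owls. Owl!"))  # (3, [4, 16, 22])
```

Because this is a plain substring search, "owl" inside a longer word such as "bowl" would be counted as well.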

Answer 1
1 2021-05-10 14:02:15

Answer 2
1 2021-05-10 14:11:57
