Counting how many times a word appears in a text using Python and Counters
Regular expressions - counting how many times a word appears in a text
I'm trying to write a function that, given some text, returns the number of times the words ['color', 'Colour', 'Color', 'Colour'] appear, so that I get the following results:
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4
What I have is:
import re
def colorcount(text):
    all_matches = re.findall('color', 'Colour', 'Color'. 'Colour', text)
    return len(all_matches)
print(colorcount(text)
It doesn't work, so how can I write the code so that it behaves the way I want?
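If you want to use a regular expression, you can do this:
import re
def colorcount(text):
    # re.I ignores case; re.X ignores the whitespace inside the pattern.
    r = re.compile(r'\bcolour\b | \bcolor\b', flags=re.I | re.X)
    count = len(r.findall(text))
    print(count)
    return count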
# These asserts work as expected without raising an AssertionError.
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4
Which outputs:
1
2
4
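The `re.X` flag matters in the pattern above: it makes the spaces around `|` purely cosmetic. The sketch below compiles the same pattern string with and without `re.X` to show the difference. The names `PATTERN`, `verbose`, and `literal` are illustrative only, not part of the answer.

```python
import re

PATTERN = r'\bcolour\b | \bcolor\b'

# With re.X, whitespace inside the pattern is ignored, so the spaces
# around "|" are only there for readability.
verbose = re.compile(PATTERN, flags=re.I | re.X)

# Without re.X, the same spaces must match literally: the alternatives
# become "colour followed by a space" and "a space followed by color".
literal = re.compile(PATTERN, flags=re.I)

text = "color Color colour Colour"
verbose_count = len(verbose.findall(text))
literal_count = len(literal.findall(text))
print(verbose_count)  # 4
print(literal_count)  # 2
```

If the flag is dropped, the spaces in the pattern should be dropped too, for example `r'\bcolour\b|\bcolor\b'`.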
You can simply convert the text to a single case (for example, all lowercase) and then loop over the keywords, using the string's count() method to count the occurrences of each one:
def colorcount(text):
    KEY_WORDS = ['color', 'colour']
    counter = 0
    sanitized_text = text.lower()
    for kw in KEY_WORDS:
        counter += sanitized_text.count(kw.lower())
    return counter
text = 'color ds das Colour dsafasft e re Color'
# Prints 3
print(colorcount(text))
# All following asserts pass
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4
Try this:
import re
def colorcount(text):
    # [cC] matches either case of the first letter; u? makes the "u" optional.
    return len(re.findall(r'[cC]olou?r', text))
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4
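Inside a character class, `|` is a literal character rather than alternation, so a class written as `[c|C]` also accepts a pipe. A quick check, with hypothetical function names, shows why the class should list only the two letters:

```python
import re

def count_with_pipe_class(text):
    # "[c|C]" matches "c", "C", or a literal "|".
    return len(re.findall(r'[c|C]olou?r', text))

def count_with_plain_class(text):
    # "[cC]" matches only "c" or "C".
    return len(re.findall(r'[cC]olou?r', text))

print(count_with_pipe_class("|olor"))   # 1
print(count_with_plain_class("|olor"))  # 0
```

Neither class matches all-caps spellings such as "COLOR". Passing `flags=re.I` would cover those if needed.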
Use the following regular expression with the re.I (case-insensitive) flag and re.findall, then return the length of the resulting list:
\bcolou?r\b
import re
def colorcount(text):
    return len(re.findall(r'\bcolou?r\b', text, flags=re.I))
print(colorcount('color Color colour Colour'))
Prints:
4
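Since the title mentions Counters: if a per-spelling breakdown is wanted rather than a single total, the list returned by `re.findall` can be passed to `collections.Counter`. This is a minimal sketch; `COLOR_PATTERN` and `color_variants` are names invented here, not taken from any answer.

```python
import re
from collections import Counter

COLOR_PATTERN = re.compile(r'\bcolou?r\b', flags=re.I)

def color_variants(text):
    """Tally each matched spelling exactly as it appears in the text."""
    return Counter(COLOR_PATTERN.findall(text))

counts = color_variants("color Color colour Colour color")
print(counts['color'])       # 2
print(sum(counts.values()))  # 5
```

Summing the values gives the same total as `len(re.findall(...))`.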