
Regular expressions - counting how many times a word appears in a text

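I'm trying to write a function that, given some text, returns the number of times the words ['color', 'colour', 'Color', 'Colour'] appear in it. So I would get the following results: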

assert colorcount("Color Purple") == 1

assert colorcount("Your colour is better than my colour") == 2

assert colorcount("color Color colour Colour") == 4

What I have is:

import re

def colorcount(text):
    all_matches = re.findall('color', 'Colour', 'Color'. 'Colour', text)
    return len(all_matches)

print(colorcount(text)

It doesn't work, so how can I write the code so that it does what I want?

If you want to use regular expressions, you can do it like this:

import re

def colorcount(text):
  r = re.compile(r'\bcolour\b | \bcolor\b', flags = re.I | re.X)
  count = len(r.findall(text))
  print(count)
  return count

# These asserts work as expected without raising an AssertionError.
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4

Which outputs:

1
2
4
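
The re.X (verbose) flag matters in the pattern above: it makes the regex engine ignore unescaped whitespace, which is why the spaces around | are harmless. A small sketch of the difference, where count_with_flags is a hypothetical helper introduced only for this comparison:

```python
import re

PATTERN = r'\bcolour\b | \bcolor\b'

def count_with_flags(text, flags):
    # Hypothetical helper: count matches of PATTERN under the given flags.
    return len(re.findall(PATTERN, text, flags=flags))

text = "color Color colour Colour"

# With re.X the spaces in the pattern are ignored, so all four words match.
verbose_count = count_with_flags(text, re.I | re.X)

# Without re.X the spaces are literal, so a match must include an adjacent
# space and some of the words are missed.
plain_count = count_with_flags(text, re.I)

print(verbose_count, plain_count)
```

If you drop re.X, remove the spaces around | as well.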

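You can simply convert the text to a single case (i.e. all lowercase) and then loop over the keywords, using the string's count() method to count each one's occurrences: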

def colorcount(text):
    KEY_WORDS = [ 'color', 'colour' ]
    counter = 0
    sanitized_text = text.lower()
    for kw in KEY_WORDS:
        counter += sanitized_text.count(kw.lower())
    return counter

text = 'color ds das Colour dsafasft e re Color'

# 3
print(colorcount(text))

# All following asserts pass
assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4
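
One caveat worth keeping in mind: str.count() counts substrings, so words such as "colorful" are included too. The sketch below contrasts that with whole-word matching; count_substrings, count_whole_words and the sample sentence are hypothetical names made up for this illustration:

```python
import re

def count_substrings(text):
    # Same idea as the count() approach: plain substring counting.
    lowered = text.lower()
    return sum(lowered.count(kw) for kw in ('color', 'colour'))

def count_whole_words(text):
    # Whole-word, case-insensitive matching using \b word boundaries.
    return len(re.findall(r'\bcolou?r\b', text, flags=re.I))

sample = "A colorful, multicolour color"

print(count_substrings(sample))   # counts "colorful" and "multicolour" as hits
print(count_whole_words(sample))  # counts only the standalone word
```

Which behaviour is right depends on whether embedded occurrences should count; for the three asserts in the question both give the same results.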

Try this:

import re

def colorcount(text):
    # [cC] matches either case of the first letter; u? makes the "u" optional.
    return len(re.findall(r'[cC]olou?r', text))

assert colorcount("Color Purple") == 1
assert colorcount("Your colour is better than my colour") == 2
assert colorcount("color Color colour Colour") == 4

Use the following regular expression with the re.I (case-insensitive) flag and re.findall, then return the length of the resulting list:

\bcolou?r\b

import re

def colorcount(text):
  return len(re.findall(r'\bcolou?r\b', text, flags=re.I))

print(colorcount('color Color colour Colour'))

Prints:

4
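
If the set of words may change, the same idea can be generalised by building the alternation from a list. This is only a sketch: count_words is a hypothetical helper, not something from the answers above, and it assumes the word list is non-empty.

```python
import re

def count_words(text, words):
    # Hypothetical helper: count whole-word, case-insensitive occurrences
    # of any entry in `words` (assumed to be a non-empty list of strings).
    # re.escape keeps special characters in a word from acting as regex syntax.
    alternation = '|'.join(re.escape(w) for w in words)
    pattern = re.compile(r'\b(?:' + alternation + r')\b', flags=re.I)
    return len(pattern.findall(text))

print(count_words("Your colour is better than my colour", ['color', 'colour']))
```

The non-capturing group (?:...) keeps the \b anchors applied to every alternative and makes findall return whole matches.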
