Regex whole-word match between digits

Question

I want to extract a whole word from a sentence. Thanks to this answer, I have the following function:

import re

def findWholeWord(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

With it, I can find the whole word in the following cases:

findWholeWord('thomas')('this is Thomas again')   # -> <match object>
findWholeWord('thomas')('this is,Thomas again')   # -> <match object>
findWholeWord('thomas')('this is,Thomas, again')  # -> <match object>
findWholeWord('thomas')('this is.Thomas, again')  # -> <match object>
findWholeWord('thomas')('this is ?Thomas again')  # -> <match object>

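Punctuation next to the word does not interfere.

However, if a digit is adjacent to the word, the word is not found.

How should I modify the expression so that it also matches when a digit is next to the word? For example: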

findWholeWord('thomas')('this is 9Thomas, again')
findWholeWord('thomas')('this is9Thomas again')
findWholeWord('thomas')('this is Thomas36 again')

Answer 1

You can use the regex (?:\d|\b){0}(?:\d|\b) to match the target word when each side of it is either a word boundary or a digit.

import re

def findWholeWord(w):
    return re.compile(r'(?:\d|\b){0}(?:\d|\b)'.format(w), flags=re.I).search

for s in [
    'this is Thomas again',
    'this is,Thomas again',
    'this is,Thomas, again',
    'this is.Thomas, again',
    'this is ?Thomas again',
    'this is 9Thomas, again',
    'this is9Thomas again',
    'this is Thomas36 again',
    'this is -Thomas- again',
    'athomas is no match',
    'thomason no match']:
    print("match >" if findWholeWord('thomas')(s) else "*no match* >", s)

Output:

match > this is Thomas again
match > this is,Thomas again
match > this is,Thomas, again
match > this is.Thomas, again
match > this is ?Thomas again
match > this is 9Thomas, again
match > this is9Thomas again
match > this is Thomas36 again
match > this is -Thomas- again
*no match* > athomas is no match
*no match* > thomason no match
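
For context, the original \b({0})\b pattern fails on the digit cases because Python's re treats digits (and the underscore) as word characters, so there is no word boundary between a digit and a letter. The following is a minimal sketch that reproduces this; the name strict_search is made up for illustration and wraps the pattern from the question unchanged.

```python
import re

# Hypothetical name; this is the pattern from the question, unchanged.
def strict_search(w):
    return re.compile(r'\b({0})\b'.format(w), flags=re.IGNORECASE).search

# There is no boundary between '9' and 'T', or between 's' and '3',
# so \b cannot match at those positions.
digit_before = strict_search('thomas')('this is 9Thomas, again')
digit_after = strict_search('thomas')('this is Thomas36 again')
# '?' is not a word character, so a boundary exists before 'T'.
punct_before = strict_search('thomas')('this is ?Thomas again')

print(digit_before)          # None
print(digit_after)           # None
print(punct_before.group())  # Thomas
```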

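If you want to reuse the same target word for several inputs or in a loop, assign the result of the findWholeWord() call to a variable and then call that variable.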

matcher = findWholeWord('thomas')
print(matcher('this is Thomas again'))
print(matcher('this is,Thomas again'))
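
Note that (?:\d|\b) consumes the adjacent digit, so the full match can include it. If only the word itself is wanted, one option is to wrap the word in a capturing group and read group(1). The sketch below is a variation on the function above, not part of the original answer; find_whole_word is a hypothetical name, and re.escape is added on the assumption that the target word might contain regex metacharacters.

```python
import re

# Hypothetical variant of findWholeWord with a capturing group around the word.
def find_whole_word(w):
    # re.escape guards against regex metacharacters in the target word.
    pattern = r'(?:\d|\b)({0})(?:\d|\b)'.format(re.escape(w))
    return re.compile(pattern, flags=re.I).search

m = find_whole_word('thomas')('this is9Thomas again')
print(m.group(0))  # 9Thomas  (full match, including the adjacent digit)
print(m.group(1))  # Thomas   (only the captured word)
```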

Answer 2

You can use the following code:

import re

def findWholeWord(w):
    return re.compile(r'(?:\d+{0}|{0}\d+|\b{0}\b)'.format(w), flags=re.I).search


print(findWholeWord('thomas')('this is 9Thomas, again'))
print(findWholeWord('thomas')('this is9Thomas again'))
print(findWholeWord('thomas')('this is Thomas36 again'))
print(findWholeWord('thomas')('this is Thomas again'))
print(findWholeWord('thomas')('this is,Thomas again'))
print(findWholeWord('thomas')('this is,Thomas, again'))
print(findWholeWord('thomas')('this is.Thomas, again'))
print(findWholeWord('thomas')('this is ?Thomas again'))
print(findWholeWord('thomas')('this is aThomas again'))

Output:

<re.Match object; span=(8, 15), match='9Thomas'>
<re.Match object; span=(7, 14), match='9Thomas'>
<re.Match object; span=(8, 16), match='Thomas36'>
<re.Match object; span=(8, 14), match='Thomas'>
<re.Match object; span=(8, 14), match='Thomas'>
<re.Match object; span=(8, 14), match='Thomas'>
<re.Match object; span=(8, 14), match='Thomas'>
<re.Match object; span=(9, 15), match='Thomas'>
None

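(?:\d+{0}|{0}\d+|\b{0}\b) matches the given word when it is preceded by one or more digits, followed by one or more digits, or stands alone as a whole word.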
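
The two answers' patterns do not appear to be equivalent on mixed input. This pattern only checks one side when a digit is present, so a digit before the word is enough even if letters follow it, whereas the pattern from Answer 1 requires a digit or boundary on both sides. A small sketch comparing them; the names answer1 and answer2 are labels made up for this comparison.

```python
import re

# Hypothetical labels for the two patterns quoted in the answers.
def answer1(w):
    return re.compile(r'(?:\d|\b){0}(?:\d|\b)'.format(w), flags=re.I).search

def answer2(w):
    return re.compile(r'(?:\d+{0}|{0}\d+|\b{0}\b)'.format(w), flags=re.I).search

for s in ['this is 9Thomas again', '9Thomason']:
    hit1 = answer1('thomas')(s) is not None
    hit2 = answer2('thomas')(s) is not None
    print(f'{s!r}: answer1={hit1}, answer2={hit2}')

# 'this is 9Thomas again': answer1=True, answer2=True
# '9Thomason': answer1=False, answer2=True
```

Which behaviour is preferable depends on whether a string such as 9Thomason should count as containing the word.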
