Regex dot doesn't work

Question

I'm trying to parse a file, and I have the following code:

import re

def learn_re(s):
    pattern = re.compile("[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} .")
    if pattern.match(s):
        return True
    return False

This matches "01:01:01.123 — ". However, as soon as I add one more character to the pattern, it stops working. For example, if I edit my code to this:

import re

def learn_re(s):
    pattern = re.compile("[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} . C")
    if pattern.match(s):
        return True
    return False

This fails to match "01:01:01.123 — C". What is happening here?

Answer 1

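The problem is that your "—" is a Unicode character. In a Python 2 str, which is a byte string, it is stored as several bytes and so behaves like several characters: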

>>> print len('—')
3

However, if you use unicode instead of str:

>>> print len(u'—')
1

So the following will print True:

import re

def learn_re(s):
    pattern=re.compile("[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3} . C")
    if pattern.match(s):
        return True
    return False

print learn_re(u"01:01:01.123 — C")

Note that this behaviour is specific to Python 2. In Python 3, str and unicode were merged into a single str type, so this distinction is no longer needed.
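For readers on Python 3, the same distinction can be reproduced by comparing a str with its encoded bytes. This is a minimal sketch rather than part of the original answer; the variable names are chosen here for illustration, and UTF-8 is assumed as the encoding.

```python
import re

dash = "\u2014"  # EM DASH, the character from the question

# A Python 3 str counts code points; the UTF-8 encoding of the dash is longer.
char_count = len(dash)
byte_count = len(dash.encode("utf-8"))

pattern = re.compile(r"[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} . C")

# On a str, the single "." consumes the whole dash, so the pattern matches.
matches_text = bool(pattern.match("01:01:01.123 \u2014 C"))

print(char_count, byte_count, matches_text)
```

The byte count is what Python 2 reported for the plain str literal, which is why a single "." was not enough there.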

Answer 2

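The em dash in your string is a Unicode character, and in a byte string it is interpreted as multiple characters (3 in your case). Your version of Python does not treat plain strings as Unicode, so you need to do one of the following: match 3 characters with .{3} to capture the dash, match the exact character in your expression, or use a different version of Python.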
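The first two options can be sketched in Python 3 by working on bytes, which stand in here for a Python 2 byte string. This is an illustration under that assumption (UTF-8 input), not the answerer's own code; the pattern names are hypothetical.

```python
import re

# Assumption: the file is UTF-8, so the em dash occupies three bytes.
line = "01:01:01.123 \u2014 C".encode("utf-8")

prefix = rb"[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} "

one_dot = re.compile(prefix + rb". C")          # "." eats only the first byte
three_dots = re.compile(prefix + rb".{3} C")    # one "." per byte of the dash
exact = re.compile(prefix + rb"\xe2\x80\x94 C") # the dash's UTF-8 bytes spelled out

one_dot_ok = one_dot.match(line) is not None
three_dots_ok = three_dots.match(line) is not None
exact_ok = exact.match(line) is not None

print(one_dot_ok, three_dots_ok, exact_ok)
```

This also shows why the shorter pattern in the question appeared to work: a trailing "." happily matches the first byte of the dash, and match() does not require the rest of the string to be consumed.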

Some notes on your expression: you should always prefix your regex strings with r'...' so that your \ escapes are interpreted correctly.
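As a small illustration of why the r prefix matters, consider \b, which is a backspace character in an ordinary string literal but a word boundary once it reaches the regex engine. The sample text and names below are assumptions made for this sketch.

```python
import re

text = "01:01:01.123 \u2014 C"

# Without the r prefix, Python turns "\b" into a backspace (0x08) before
# the regex engine ever sees it, so this looks for backspace, C, backspace.
plain = re.compile("\bC\b")

# With the r prefix, the backslash survives and "\b" means "word boundary".
raw = re.compile(r"\bC\b")

plain_found = plain.search(text) is not None
raw_found = raw.search(text) is not None

print(plain_found, raw_found)
```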

A . has a special meaning in a regex: it matches any single character. If you want a literal period/decimal point, you need to escape the dot as \. :

pattern = re.compile(r'[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3} .')
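To make the difference concrete, here is a short sketch comparing the unescaped and escaped dot on a malformed timestamp. The input strings are made up for illustration.

```python
import re

unescaped = re.compile(r"[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]{3}")
escaped = re.compile(r"[0-9]{2}:[0-9]{2}:[0-9]{2}\.[0-9]{3}")

bad_input = "01:01:01x123"   # hypothetical malformed timestamp
good_input = "01:01:01.123"

# The bare "." accepts any character in that position, including "x".
unescaped_accepts_bad = bool(unescaped.match(bad_input))
# The escaped "\." accepts only a literal period.
escaped_accepts_bad = bool(escaped.match(bad_input))
escaped_accepts_good = bool(escaped.match(good_input))

print(unescaped_accepts_bad, escaped_accepts_bad, escaped_accepts_good)
```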

Answer 1: score 4, accepted, 2016-10-04 21:45:30

Answer 2: score 1, 2016-10-04 21:21:42
