
Python regex match whole string only

Is there any easy way to test whether a regex matches an entire string in Python? I thought that putting $ at the end would do this, but it turns out that $ doesn't work in the case of trailing newlines.

For example, the following returns a match, even though that's not what I want.

re.match(r'\w+$', 'foo\n')

You can use \Z:

\Z

Matches only at the end of the string.

In [5]: re.match(r'\w+\Z', 'foo\n')

In [6]: re.match(r'\w+\Z', 'foo')
Out[6]: <_sre.SRE_Match object; span=(0, 3), match='foo'>
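As a related sketch, not part of the answer above: since Python 3.4 the standard library also provides `re.fullmatch`, which succeeds only if the pattern matches the entire string, so no end anchor is needed. The variable names below are illustrative only.

```python
import re

# Compile once; fullmatch() requires the pattern to consume the whole string.
word = re.compile(r'\w+')

# The trailing newline is not a word character, so the whole string cannot match.
with_newline = word.fullmatch('foo\n')

# Without the newline, the entire string is matched.
without_newline = word.fullmatch('foo')

print(with_newline)             # None
print(without_newline.group())  # foo
```

This behaves like appending `\Z` to the pattern, with the difference that the pattern itself stays unchanged.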

You can use a negative lookahead assertion to require that the $ is not followed by a trailing newline:

>>> re.match(r'\w+$(?!\n)', 'foo\n')
>>> re.match(r'\w+$(?!\n)', 'foo')
<_sre.SRE_Match object; span=(0, 3), match='foo'>

re.MULTILINE is not relevant here; OP has it turned off and the regex is still matching. The problem is that $ always matches right before the trailing newline:

When [re.MULTILINE is] specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.
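The quoted behaviour can be checked with a short sketch. The sample strings and variable names are made up for illustration; only `re.search`, `re.MULTILINE`, `$` and `\Z` come from the discussion.

```python
import re

# Default mode: '$' matches at the very end and just before a final newline.
default_trailing = re.search(r'foo$', 'foo\n')    # matches 'foo'

# Default mode: a newline in the middle of the string does not count.
default_inner = re.search(r'foo$', 'foo\nbar')    # no match

# MULTILINE mode: '$' also matches before every newline.
multiline_inner = re.search(r'foo$', 'foo\nbar', re.MULTILINE)  # matches 'foo'

# '\Z' matches only at the very end, regardless of newlines.
strict_end = re.search(r'foo\Z', 'foo\n')         # no match

print(default_trailing, default_inner, multiline_inner, strict_end)
```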

I have experimentally verified that this works correctly with re.X enabled.
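A minimal sketch of that check, assuming "this" refers to the negative-lookahead pattern from this answer. The name `strict_word` is hypothetical. In verbose mode, whitespace and `#` comments inside the pattern are ignored, while the `\n` escape inside the lookahead is still interpreted by the regex engine.

```python
import re

# Same pattern as r'\w+$(?!\n)', written with re.X (re.VERBOSE).
strict_word = re.compile(r'''
    \w+      # one or more word characters
    $        # end of string, or just before a final newline
    (?!\n)   # reject the position just before a final newline
''', re.X)

rejected = strict_word.match('foo\n')
accepted = strict_word.match('foo')

print(rejected)          # None
print(accepted.group())  # foo
```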

To test whether you matched the entire string, just check if the matched string is as long as the entire string:

m = re.match(r".*", mystring)
start, stop = m.span()
if stop-start == len(mystring):
    print("The entire string matched")

Note: This is independent of the question (which you didn't ask) of how to match a trailing newline.

Based on @alexis's answer, a method to check for a full match could look like this:

def fullMatch(matchObject, fullString):
    if matchObject is None:
        return False
    start, stop = matchObject.span()
    return stop-start == len(fullString)

Where fullString is the string on which you apply the regex, and matchObject is the result of matchObject = re.match(yourRegex, fullString).
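A usage sketch of the length check, restated under the hypothetical name `is_full_match` so the block runs on its own. It also shows one caveat: `re.match` returns the first match the engine finds, which is not necessarily the longest, so the length check can report False even though the pattern is able to match the whole string. `re.fullmatch` (Python 3.4+) does not have this problem.

```python
import re

def is_full_match(match_object, full_string):
    """Return True if the match spans all of full_string (hypothetical helper)."""
    if match_object is None:
        return False
    start, stop = match_object.span()
    return stop - start == len(full_string)

# Plain cases: the trailing newline makes the match shorter than the string.
word_ok = is_full_match(re.match(r'\w+', 'foo'), 'foo')
word_newline = is_full_match(re.match(r'\w+', 'foo\n'), 'foo\n')

# Caveat: with alternation, re.match stops at the first alternative ('a'),
# so the length check fails although 'ab' as a whole fits the pattern.
alt_by_length = is_full_match(re.match(r'a|ab', 'ab'), 'ab')
alt_by_fullmatch = re.fullmatch(r'a|ab', 'ab') is not None

print(word_ok, word_newline, alt_by_length, alt_by_fullmatch)
# True False False True
```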
