Python regex to ignore sentences with two consecutive uppercase letters

Question

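I have a simple problem at hand: ignore sentences that contain two or more consecutive uppercase letters, along with a few other grammar rules.

Problem: by definition, the regex should not match the string 'This is something with two CAPS.', but it does.

Code: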

''' Check if a given sentence conforms to given grammar rules

    $ Rules
        * Sentence must start with a Uppercase character (e.g. Noun/ I/ We/ He etc.)
        * Then lowercase character follows.
        * There must be spaces between words.
        * Then the sentence must end with a full stop(.) after a word.
        * Two continuous spaces are not allowed.
        * Two continuous upper case characters are not allowed.
        * However the sentence can end after an upper case character.
'''

import re


# Returns true if sentence follows these rules else returns false
def check_sentence(sentence):
    checker = re.compile(r"^((^(?![A-Z][A-Z]+))([A-Z][a-z]+)(\s\w+)+\.$)")
    return checker.match(sentence)

print(check_sentence('This is something with two CAPS.'))

Output:

<_sre.SRE_Match object; span=(0, 32), match='This is something with two CAPS.'>

Answer 1

It may be easier to write your regex in the negative (find everything that makes a sentence bad) than in the positive.

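def check_sentence(sentence):
    # Reject two capitals in a row, two spaces in a row, or a lowercase first letter.
    checker = re.compile(r'([A-Z][A-Z]|[ ][ ]|^[a-z])')
    # Require the overall shape: capital, lowercase, at least one space, final full stop.
    check2 = re.compile(r'^[A-Z][a-z].* .*\.$')
    return not checker.search(sentence) and bool(check2.search(sentence))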

Answer 2

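Your negative lookahead only applies at the start of the string being tested.

2nd capturing group (^(?![A-Z][A-Z]+))

^ asserts position at the start of the string

Negative lookahead (?![A-Z][A-Z]+)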

"This will NOT fail."

"THIS will fail."
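
To make that concrete, here is a minimal sketch comparing the question's pattern with one possible fix. The `fixed` pattern is my own assumption, not something from the question: it puts `.*` inside the lookahead so that it scans the whole sentence instead of only the first two characters, and it uses a literal space between words rather than `\s`. The `pattern` parameter on `check_sentence` is also added here purely for the comparison.

```python
import re

# The question's pattern: the lookahead sits right after ^, so it only
# ever inspects the first two characters of the string.
original = re.compile(r"^((^(?![A-Z][A-Z]+))([A-Z][a-z]+)(\s\w+)+\.$)")

# Hypothetical fix: ".*" inside the lookahead lets it look for two
# consecutive capitals anywhere in the sentence.
fixed = re.compile(r"^(?!.*[A-Z]{2})[A-Z][a-z]+(?: \w+)+\.$")


def check_sentence(sentence, pattern=fixed):
    """Return True if the sentence matches the given pattern."""
    return pattern.match(sentence) is not None


samples = [
    "This is something with two CAPS.",
    "This will NOT fail.",
    "THIS will fail.",
    "This is fine.",
]
for s in samples:
    # Columns: sentence, result with the original pattern, result with the fix
    print(repr(s), check_sentence(s, original), check_sentence(s, fixed))
```

With the original pattern only "THIS will fail." is rejected, because its capitals happen to be at the very start. With the lookahead scanning the whole string, the first three samples are all rejected and only "This is fine." passes.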

Solution 1
0, accepted, 2016-10-09 04:39:01

Solution 2
0, 2016-10-09 04:47:06