Python RegEx that matches a char followed/preceded by the same char but uppercase/lowercase

Question

I am trying to build a regex which will find: aA, Aa, bB, cC but won't match: aB, aa, AA, aC, Ca.

- If we meet a lowercase letter, we want to check whether the next/previous letter is uppercase.
- If we meet an uppercase letter, we want to check whether the next/previous letter is lowercase.
- Pairs where both letters are uppercase, or both are lowercase, shouldn't be found by our regex.

I want any char to be followed/preceded by the SAME CHAR but in the opposite case.

Answer 1

You may do it with the PyPI regex module (note the pattern will also work with Java, PCRE (PHP, R, Delphi), Perl and .NET, but won't work with ECMAScript (JavaScript, C++ std::regex) or RE2 (Go, Google Apps Script)) using

(\p{L})(?!\1)(?i:\1)

See the regex demo and a proof that it works in Python:

import regex
rx = r'(\p{L})(?!\1)(?i:\1)'
print([x.group() for x in regex.finditer(rx, ' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca')])
# => ['aA', 'Aa', 'bB', 'cC']

The solution is based on the inline modifier group (?i:...), inside which all chars are treated in a case insensitive way while the other parts stay case sensitive (provided there is no other (?i) or re.I in effect).

Details

(\p{L}) - any letter, captured into Group 1
(?!\1) - a negative lookahead that fails the match if the next char is exactly identical to the one captured in Group 1. Note that the regex index is still right after the char captured with (\p{L})
(?i:\1) - a case insensitive modifier group that contains a backreference to the value of Group 1. Since it matches in a case insensitive way, it could match both a and A, BUT the preceding lookahead has already excluded the variant with the same case (since the \1 in the lookahead matches in a case sensitive way), so only the alternate case remains.
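The three parts described above can be mirrored in plain Python, which may help when checking the logic without the regex module. This is only a sketch: the function name swapped_case_pairs is made up for illustration, and str.isalpha() is used as a rough stand-in for \p{L}, which is an assumption rather than an exact equivalent.

```python
def swapped_case_pairs(text):
    """Return non-overlapping two-char runs such as 'aA' or 'Bb' (illustrative helper)."""
    pairs = []
    i = 0
    while i < len(text) - 1:
        first, second = text[i], text[i + 1]
        is_letter = first.isalpha()                            # roughly (\p{L})
        not_identical = first != second                        # (?!\1)
        same_ignoring_case = first.lower() == second.lower()   # (?i:\1)
        if is_letter and not_identical and same_ignoring_case:
            pairs.append(first + second)
            i += 2  # consume both chars, as a regex match would
        else:
            i += 1
    return pairs

print(swapped_case_pairs(' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca'))
# => ['aA', 'Aa', 'bB', 'cC']
```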

What about a re solution?

In re, a global (?i) cannot make only part of a pattern case insensitive: wherever it appears, it makes the whole pattern case insensitive. Besides, re did not support modifier groups such as (?i:...) before Python 3.6.
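On Python 3.6 and newer, re does accept scoped inline flags such as (?i:...), so the same idea can be tried with the standard library alone. The sketch below assumes such an interpreter and swaps \p{L} (which re does not support) for [^\W\d_].

```python
import re
import sys

# Scoped inline flags like (?i:...) need Python 3.6 or newer.
assert sys.version_info >= (3, 6)

# Letter, then "not the identical char", then the same letter ignoring case.
rx = re.compile(r'([^\W\d_])(?!\1)(?i:\1)')
matches = [m.group() for m in rx.finditer(' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca')]
print(matches)
```

On older interpreters the pattern fails to compile, which is where a filtering workaround is still needed.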

You may use something like

import re
rx = r'(?i)([^\W\d_])(\1)'
print([x.group() for x in re.finditer(rx, ' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca') if x.group(1) != x.group(2)])

See the Python demo.

(?i) - makes the whole regex case insensitive
([^\W\d_]) - a letter is captured into Group 1
(\1) - the same letter is captured into Group 2 (case insensitive, so Aa, aA, aa and AA will all match).

The if x.group(1) != x.group(2) condition filters out the unwanted matches.
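The match-then-filter approach can be wrapped in a small helper for reuse. The name find_swapped_case is hypothetical. The sketch also shows one edge case worth knowing about: because finditer returns non-overlapping matches, a same-case pair that is filtered out can swallow the first letter of a valid pair.

```python
import re

# Case-insensitive pair of identical letters; the case check itself happens in Python.
PAIR = re.compile(r'(?i)([^\W\d_])(\1)')

def find_swapped_case(text):
    """Keep only pairs whose two chars differ, i.e. differ in case (illustrative helper)."""
    return [m.group() for m in PAIR.finditer(text) if m.group(1) != m.group(2)]

print(find_swapped_case(' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca'))
# => ['aA', 'Aa', 'bB', 'cC']

# Edge case: 'aa' is matched first and then discarded, so the trailing 'aA' is never seen.
print(find_swapped_case('aaA'))
# => []
```

The lookahead-based pattern (\p{L})(?!\1)(?i:\1) does not have this problem, since it rejects 'aa' before consuming it.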

Answer 2

This can be done with re:

import re
import string

# Build the alternation aA|bB|...|zZ from the ASCII alphabet
pattern = re.compile('|'.join(l + u for l, u in zip(string.ascii_lowercase, string.ascii_uppercase)))
pattern.search(your_text)  # your_text is the string you want to search

If you're looking for a repeated letter that switches case (either lower to upper or upper to lower), then you can use:

pattern = '|'.join([u + l for u, l in zip(string.ascii_uppercase, string.ascii_lowercase)] + [l + u for l, u in zip(string.ascii_lowercase, string.ascii_uppercase)])
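As a quick check of this alternation-based approach, the sketch below compiles the combined pattern and runs it on the sample from the question. The variable names are illustrative. Since the pattern is built from string.ascii_lowercase and string.ascii_uppercase, it only covers the 26 ASCII letters, which the last line demonstrates.

```python
import re
import string

lower, upper = string.ascii_lowercase, string.ascii_uppercase

# Alternation of all 52 swapped-case pairs: aA|bB|...|zZ|Aa|Bb|...|Zz
pattern = re.compile('|'.join([l + u for l, u in zip(lower, upper)] +
                              [u + l for l, u in zip(lower, upper)]))

sample = ' aA, Aa, bB, cC but not aB, aa, AA, aC, Ca'
print(pattern.findall(sample))
# => ['aA', 'Aa', 'bB', 'cC']

# Non-ASCII letters are not covered by this pattern.
print(pattern.findall('\u00e9\u00c9'))  # 'éÉ'
# => []
```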

Answer 1
4, accepted, 2018-12-06 07:51:08

Answer 2
0, 2018-12-05 20:02:42