Multiple occurrences of the same character in a string regexp - Python

Given a string made up of 3 capital letters, 1 lowercase letter and another 3 capital letters, e.g. AAAaAAA

I can't seem to find a regexp that matches a string which has:

  • first 3 capital letters all different
  • any lowercase letter
  • next 2 capital letters the same as the very first one
  • last capital letter the same as the last capital letter in the first "trio"

e.g. A B C a AA C (without the spaces)

EDIT:

Turns out I needed something slightly different, e.g. ABCaAAC, where 'a' is the lowercase version of the very first character, not just any lowercase character.

The following should work:

^([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3$

For example:

>>> import re
>>> regex = re.compile(r'^([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3$')
>>> regex.match('ABAaAAA')  # fails: first three are not different
>>> regex.match('ABCaABC')  # fails: first two of second three are not first char
>>> regex.match('ABCaAAB')  # fails: last char is not last of first three
>>> regex.match('ABCaAAC')  # matches!
<_sre.SRE_Match object at 0x7fe09a44a880>

Explanation:

^          # start of string
([A-Z])    # match any uppercase character, place in \1
(?!.?\1)   # fail if either of the next two characters is the same as \1
([A-Z])    # match any uppercase character, place in \2
(?!\2)     # fail if the next character is the same as \2
([A-Z])    # match any uppercase character, place in \3
[a-z]      # match any lowercase character
\1         # match capture group 1
\1         # match capture group 1
\3         # match capture group 3
$          # end of string
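
The commented breakdown above is close to what Python's re.VERBOSE flag accepts directly. As a minimal sketch (the name verbose_regex and the comment wording are only illustrative), the same pattern can be compiled with its explanation kept inline:

```python
import re

# With re.VERBOSE, whitespace in the pattern is ignored and '#' starts a comment.
verbose_regex = re.compile(r'''
    ^
    ([A-Z])     # first capital letter, captured as group 1
    (?!.?\1)    # neither of the next two characters may repeat group 1
    ([A-Z])     # second capital letter, captured as group 2
    (?!\2)      # the next character may not repeat group 2
    ([A-Z])     # third capital letter, captured as group 3
    [a-z]       # any lowercase letter
    \1\1        # group 1 twice
    \3          # group 3
    $
''', re.VERBOSE)

print(bool(verbose_regex.match('ABCaAAC')))
print(bool(verbose_regex.match('ABAaAAA')))
```

This should behave the same as the one-line pattern; it only changes how the pattern is written down.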

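If you want to pull these matches out from a larger chunk of text, just get rid of the ^ and $ and use regex.search() or regex.findall(). Note that because the pattern contains capture groups, regex.findall() returns a tuple of the groups for each match rather than the whole matched string; use regex.finditer() and call group(0) on each match if you need the full text.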
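
As a rough sketch of searching inside a larger text, assuming a made-up sample string and the illustrative names pattern, groups and matches:

```python
import re

# Same pattern as above, minus the ^ and $ anchors, so it can match anywhere.
pattern = re.compile(r'([A-Z])(?!.?\1)([A-Z])(?!\2)([A-Z])[a-z]\1\1\3')

text = 'xx ABCaAAC yy ABAaAAA zz XYZqXXZ'  # hypothetical sample text

# findall() returns one tuple of the captured groups per match ...
groups = pattern.findall(text)

# ... while finditer() yields match objects, so group(0) is the full match.
matches = [m.group(0) for m in pattern.finditer(text)]

print(groups)   # [('A', 'B', 'C'), ('X', 'Y', 'Z')]
print(matches)  # ['ABCaAAC', 'XYZqXXZ']
```

Without the anchors the pattern can also match inside a longer run of letters, so adding \b at both ends may be worth considering if the strings are expected to stand alone as words.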

You may, however, find the following approach easier to understand. It uses a regex for the basic validation and then uses normal string operations to test the extra requirements:

import re

def validate(s):
    # The regex checks the overall shape; the comparisons check the repeated letters.
    # bool() makes a non-matching string return False rather than None.
    return bool(re.match(r'^[A-Z]{3}[a-z][A-Z]{3}$', s) and s[4] == s[0] and
                s[5] == s[0] and s[-1] == s[2] and len(set(s[:3])) == 3)

>>> validate('ABAaAAA')
False
>>> validate('ABCaABC')
False
>>> validate('ABCaAAB')
False
>>> validate('ABCaAAC')
True
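
Neither version above covers the EDIT in the question, where the lowercase letter has to be the lowercase form of the first capital. One possible way to add that check with plain string operations is sketched below; validate_edit is a hypothetical name, and re.fullmatch is swapped in for the anchored re.match:

```python
import re

def validate_edit(s):
    # fullmatch() checks the overall shape: 3 capitals, 1 lowercase, 3 capitals.
    return bool(re.fullmatch(r'[A-Z]{3}[a-z][A-Z]{3}', s)
                and len(set(s[:3])) == 3        # first three capitals all differ
                and s[3] == s[0].lower()        # lowercase form of the first capital
                and s[4] == s[0] and s[5] == s[0]
                and s[6] == s[2])               # last capital equals the third one

print(validate_edit('ABCaAAC'))  # True
print(validate_edit('ABCbAAC'))  # False: 'b' is not the lowercase of 'A'
```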

