Regex Expression not matching correctly

I'm tackling a Python Challenge problem that requires finding a block of text in the format xXXXxXXXx (x is any lowercase letter and X is any uppercase letter, not literal X's) in a chunk like this:

jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn

I tested the following regex on http://www.regexr.com/ and found that it correctly matches what I am looking for:

'([a-z])([A-Z]){3}([a-z])([A-Z]){3}([a-z])'

However, when I try to match this expression to the block of text, it just returns the entire string:

In [1]: import re

In [2]: example = 'jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn'

In [3]: expression = re.compile(r'([a-z])([A-Z]){3}([a-z])([A-Z]){3}([a-z])')

In [4]: found = expression.search(example)

In [5]: print found.string
jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn

Any ideas? Is my expression incorrect? Also, if there is a simpler way to represent that expression, feel free to let me know. I'm fairly new to RegEx.

You need to read the matched text with group() instead of the string attribute.

>>> import re
>>> s = 'jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn'
>>> rgx = re.compile(r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]')
>>> found = rgx.search(s).group()
>>> print found
nJDKoJIWh

The string attribute always returns the string that was passed as input to the match. This is documented for match objects:

string

The string passed to match() or search().
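
To make the difference concrete, here is a minimal sketch using the string from the question. It uses print() as a function so that it runs on Python 3 (the sessions above use the Python 2 print statement), and the variable names are only illustrative.

```python
import re

s = 'jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn'
rgx = re.compile(r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]')

m = rgx.search(s)
if m is None:
    # search() returns None when nothing matches, so guard before using m
    print('no match')
else:
    print(m.string)   # the whole input that was searched
    print(m.group())  # only the matched text: nJDKoJIWh
    print(m.span())   # (start, end) indices of the match: (17, 26)
```

The None check matters because chaining .group() directly onto search() raises AttributeError when the pattern is not found.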

The problem has nothing to do with the matching; you're just grabbing the wrong thing from the match object. Use match.group(0) (or match.group()).
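
As a side note on what else the match object holds, the sketch below reuses the pattern from the question, which wraps each piece in a capturing group. group(0) is the whole match, while a repeated group such as ([A-Z]){3} keeps only its last repetition. It is written for Python 3, so print() is called as a function.

```python
import re

example = 'jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn'
expression = re.compile(r'([a-z])([A-Z]){3}([a-z])([A-Z]){3}([a-z])')

found = expression.search(example)
print(found.group(0))  # whole match: nJDKoJIWh
print(found.group())   # same as group(0)
print(found.groups())  # one entry per capturing group: ('n', 'K', 'o', 'W', 'h')
```

If the individual letters are not needed, dropping the parentheses (as in the first answer's pattern) gives the same overall match.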

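Based on xXXXxXXXx, if you want runs of exactly three uppercase letters with a single lowercase letter between them, you can use the pattern below. The + allows one or more repetitions of the XXXx unit, so it also accepts xXXXx on its own as well as longer chains: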

([a-z])(([A-Z]){3}([a-z]))+

You can also get the matched text from your search call with group():

print expression.search(example).group(0)
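
Here is a quick check of how the + version compares with the fixed-length pattern, along with one possible shorter spelling of the fixed-length pattern that uses a non-capturing group. The sample input 'aBCDe' and the variable names are made up for illustration. This is a sketch for Python 3, not the only way to write it.

```python
import re

example = 'jdskvSJNDfbSJneSfnJDKoJIWhsjnfakjn'

fixed = re.compile(r'[a-z][A-Z]{3}[a-z][A-Z]{3}[a-z]')
repeated = re.compile(r'([a-z])(([A-Z]){3}([a-z]))+')
# Shorter spelling of `fixed`: one lowercase, then exactly two "XXXx" units
compact = re.compile(r'[a-z](?:[A-Z]{3}[a-z]){2}')

print(fixed.search(example).group(0))     # nJDKoJIWh
print(repeated.search(example).group(0))  # nJDKoJIWh
print(compact.search(example).group(0))   # nJDKoJIWh

# The + version also accepts a single "XXXx" unit, which the others reject
short = 'aBCDe'  # hypothetical input
print(repeated.search(short).group(0))  # aBCDe
print(fixed.search(short))              # None
print(compact.search(short))            # None
```

If only the exact xXXXxXXXx shape should match, the {2} form is the safer choice; the + form fits when chains of any length are acceptable.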
