
Python regex: How can I match start of string in a selection?

I want to match some digits preceded by a non-digit or at the start of the string.

Since the caret does not mean "start of string" inside brackets, I can't use it there, so I checked the reference and found the alternative form \A.

However, when I try to use it I get an error:

>>> s = '123'
>>> re.findall('[\D\A]\d+', s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: internal: unsupported set operator

What am I doing wrong?

You can use a negative lookbehind:

(?<!\d)\d+

Your problem is that you are using \A (a zero-width assertion) inside a character class, and a character class can only match a single character. You could write it as (?:\D|\A) instead, but a lookbehind is nicer.
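To make the comparison concrete, here is a small sketch of both forms side by side. The sample string and the variable names are made up for illustration; the last part simply checks whether the character-class version compiles at all.

```python
import re

s = 'abc123 45x6'  # made-up sample input

# Negative lookbehind: a run of digits that is not preceded by a digit.
lookbehind = re.findall(r'(?<!\d)\d+', s)

# Alternation: \D consumes the preceding character, so the digits go in a
# capture group to keep that character out of the results.
alternation = re.findall(r'(?:\D|\A)(\d+)', s)

# \A inside a character class is rejected when the pattern is compiled.
try:
    re.compile(r'[\D\A]\d+')
    class_compiles = True
except re.error:
    class_compiles = False

print(lookbehind)   # ['123', '45', '6']
print(alternation)  # ['123', '45', '6']
print(class_compiles)
```

Both working forms return the same digits here. The lookbehind is tidier because it consumes nothing, so no capture group is needed. The exact error message for the character-class version differs between Python versions, which is why the sketch only catches re.error.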

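Repetition in regular expressions is greedy by default, so \d+ always consumes a whole run of digits, and every match necessarily starts either at the start of the string or right after a non-digit. Using re.findall() with the plain regex \d+ will therefore get you exactly what you want: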

re.findall(r'\d+', s)
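As a quick check of that claim, the sketch below compares plain \d+ with the lookbehind version on a made-up string. It also shows one case where the two differ: with a bounded quantifier such as \d{2}, greediness no longer covers the whole run.

```python
import re

s = 'abc123 45x6'  # made-up sample input

# Greedy \d+ swallows each full run of digits, so the guard is redundant.
plain = re.findall(r'\d+', s)
guarded = re.findall(r'(?<!\d)\d+', s)

# With a fixed width, a second match can start in the middle of a run,
# and the lookbehind is what prevents it.
plain_pairs = re.findall(r'\d{2}', '12345')
guarded_pairs = re.findall(r'(?<!\d)\d{2}', '12345')

print(plain, guarded)    # ['123', '45', '6'] ['123', '45', '6']
print(plain_pairs)       # ['12', '34']
print(guarded_pairs)     # ['12']
```

So the shortcut is fine for \d+ with findall, but the lookbehind is still worth knowing if the quantifier ever changes.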

As a side note, you should be using raw strings when writing regular expressions to make sure the backslashes are interpreted properly.
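A minimal illustration of why raw strings matter, using \b, which Python itself treats as a backspace in a normal string. The input string is made up for the example.

```python
import re

text = 'a1 22 b'  # made-up sample input

# In a normal string, '\b' is a single backspace character.
# In a raw string, r'\b' is a backslash plus 'b', which re reads as a word boundary.
normal_len = len('\b')
raw_len = len(r'\b')

# The normal-string pattern looks for literal backspaces around "22".
without_raw = re.findall('\b22\b', text)
with_raw = re.findall(r'\b22\b', text)

print(normal_len, raw_len)  # 1 2
print(without_raw)          # []
print(with_raw)             # ['22']
```

Patterns like '\d+' happen to work without the r prefix only because Python leaves unrecognised escapes such as \d untouched, so it is safer not to rely on that.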
