Python regex: How can I match start of string in a selection?

I want to match some digits preceded by a non-digit or at the start of the string.

As the caret does not act as a start-of-string anchor inside brackets, I can't use it there, so I checked the reference and discovered the alternative form \A.

However, when I try to use it I get an error:

>>> import re
>>> s = '123'
>>> re.findall('[\D\A]\d+', s)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 177, in findall
    return _compile(pattern, flags).findall(string)
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/re.py", line 245, in _compile
    raise error, v # invalid expression
sre_constants.error: internal: unsupported set operator

What am I doing wrong?

You can use a negative lookbehind:

(?<!\d)\d+

Your problem is that you are using \A (a zero-width assertion) inside a character class, which can only match a single character. You could write it as (?:\D|\A) instead, but a lookbehind is nicer because it does not include the preceding character in the match.
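To make the difference between the two forms concrete, here is a minimal sketch. The sample string `s` and the variable names are made up for illustration and are not from the original question.

```python
import re

# Hypothetical sample input, chosen so that some digit runs follow a non-digit.
s = "123 abc456 7x89"

# Negative lookbehind: match digits that are not preceded by another digit.
# The lookbehind consumes nothing, so only the digits are returned.
lookbehind = re.findall(r"(?<!\d)\d+", s)

# Alternation: \D consumes the preceding non-digit, so that character
# becomes part of the match (except at the start, where \A matches).
alternation = re.findall(r"(?:\D|\A)\d+", s)

print(lookbehind)   # ['123', '456', '7', '89']
print(alternation)  # ['123', 'c456', ' 7', 'x89']
```

With the alternation form you would need a capturing group around \d+ to get only the digits back, which is one reason the lookbehind reads more cleanly.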

Repetition in regular expressions is greedy by default, so using re.findall() with the regex \d+ will get you exactly what you want:

re.findall(r'\d+', s)
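As a rough illustration of why greediness is enough here, the sketch below compares the greedy pattern with a lookbehind version and with a lazy quantifier. The sample string is an assumption made for the example.

```python
import re

# Hypothetical sample input, not taken from the original question.
s = "123 abc456 7x89"

# Greedy \d+ consumes each whole run of digits, so a match can never
# start in the middle of a run.
greedy = re.findall(r"\d+", s)

# For comparison: the explicit "not preceded by a digit" version.
lookbehind = re.findall(r"(?<!\d)\d+", s)

# A lazy quantifier stops after the first digit, splitting every run.
lazy = re.findall(r"\d+?", s)

print(greedy)                # ['123', '456', '7', '89']
print(greedy == lookbehind)  # True
print(lazy)                  # ['1', '2', '3', '4', '5', '6', '7', '8', '9']
```

Because re.findall() scans left to right and returns non-overlapping matches, the greedy pattern already yields one match per run of digits.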

As a side note, you should use raw strings when writing regular expressions so that the backslashes are interpreted properly.
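A small sketch of what can go wrong without raw strings, using \b as the example escape. The pattern and the sample text are invented for illustration.

```python
import re

# In a normal string literal, "\b" is a single backspace character.
plain = "\bcat\b"
# In a raw string, the backslash is kept, so the regex engine sees \b
# and treats it as a word boundary.
raw = r"\bcat\b"

text = "cat concat"  # hypothetical sample text

plain_matches = re.findall(plain, text)
raw_matches = re.findall(raw, text)

print(len(plain), len(raw))  # 5 7
print(plain_matches)         # []
print(raw_matches)           # ['cat']
```

The non-raw pattern silently searches for literal backspace characters and finds nothing, which is the kind of bug the raw-string habit avoids.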
