
Capture $ in regex Python

I'm trying to capture the dollar amount in a line:

example: blah blah blah (blah $23.32 blah) blah blac (blah) I want to capture "$23.32"

This is what I'm using: r'?([\\$][.*]+)'

I'm telling it to find one occurrence of (...) with ?. Then I tell it to find something which starts with a "$" followed by any characters which may come after (so I can get the decimal point also).

However, I get an error: nothing to repeat

The question mark at the start is the cause of the nothing to repeat error.

>>> import re
>>> re.compile(r'?')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mj/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/Users/mj/Development/venvs/stackoverflow-2.7/lib/python2.7/re.py", line 242, in _compile
    raise error, v # invalid expression
sre_constants.error: nothing to repeat
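
The traceback above comes from Python 2.7. As a small sketch, the same failure can be reproduced and caught in Python 3, where the exception is exposed as `re.error`. The helper name `compile_or_report` is made up for this illustration, and the exact wording of the message varies between Python versions.

```python
import re

def compile_or_report(pattern):
    """Return a compiled pattern, or the error message if it is invalid."""
    try:
        return re.compile(pattern)
    except re.error as exc:
        return str(exc)

# A leading ? has nothing before it to quantify, so compilation fails.
print(compile_or_report(r'?'))

# Escaping the ? turns it into a literal question mark, which compiles fine.
print(compile_or_report(r'\?'))
```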

Match the dollar sign followed by digits and dots:

r'\$[\d.]+'

Demo:

>>> re.search(r'\$[\d.]+', 'blah blah blah (blah $23.32 blah) blah blac (blah)').group()
'$23.32'
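
Note that `[\d.]+` also accepts a trailing period at the end of a sentence, or strings such as `$1.2.3`. If amounts always look like digits with an optional two-digit cents part, a stricter pattern is one option. This is a sketch under that assumption; the name `AMOUNT` and the sample text are invented for illustration.

```python
import re

# Assumption: an amount is "$", one or more digits, then optionally
# a "." followed by exactly two digits.
AMOUNT = re.compile(r'\$\d+(?:\.\d{2})?')

text = 'blah (blah $23.32 blah) and it cost $5. Then $7.50 more'

# findall returns every non-overlapping match in the line.
print(AMOUNT.findall(text))  # ['$23.32', '$5', '$7.50']

# For comparison, the looser class keeps the sentence-ending period.
print(re.search(r'\$[\d.]+', 'it cost $5.').group())  # '$5.'
```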

You should brush up on the basics of regular expressions. The error is due to the ? at the beginning: it is a quantifier, and there is nothing before it to quantify. Your use of * and + also does not make much sense. Without knowing your exact requirements it's hard to propose a better solution, because there are too many problems with your regex.

Well, according to http://docs.python.org/2/library/re.html , [.*]+ would match .*..* , *....* , *.*.*. etc., because special characters lose their special meaning inside sets. Use [.\d]+ or [.0-9]+ instead.
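
A quick check of that claim, as a sketch: inside a character class, `.` and `*` are literal characters, so `[.*]+` never matches digits. The sample strings and the name `literal_class` are made up for this illustration.

```python
import re

# Inside [...] the dot and star are literals, not "any character" or "repeat".
literal_class = re.compile(r'[.*]+')

print(literal_class.fullmatch('.*..*') is not None)  # True: only dots and stars
print(literal_class.fullmatch('23.32') is not None)  # False: digits are not in the set

# Putting \d in the set lets the digits match as well.
print(re.search(r'\$[.\d]+', '(blah $23.32 blah)').group())  # '$23.32'
```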

While the regex suggestions are the way to go for more complicated patterns (and regexes are well worth the time to learn in general), there are other ways for simple cases. If I'm understanding the question, it seems that a little list comprehension, like:

x='blah blah blah (blah $23.32 blah) blah blac (blah)'
[i for i in x.split() if i.find('$') > -1]

would be a pretty concise way to go. It returns a list of strings.

['$23.32']

or, if there are no matches found,

[]
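
One caveat with the split approach, shown as a sketch: it returns whole whitespace-separated tokens, so any punctuation attached to the amount comes along. The function name `dollar_tokens`, the second sample string, and the `strip` cleanup are illustrative additions, not part of the original answer.

```python
def dollar_tokens(line):
    """Return the whitespace-separated tokens that contain a dollar sign."""
    return [token for token in line.split() if '$' in token]

# The amount is its own token here, so it comes back clean.
print(dollar_tokens('blah (blah $23.32 blah) blah'))  # ['$23.32']

# Here the parentheses touch the amount and stay attached.
print(dollar_tokens('blah ($23.32) blah'))  # ['($23.32)']

# Stripping common surrounding punctuation is one possible cleanup.
print([t.strip('().,;') for t in dollar_tokens('blah ($23.32) blah')])  # ['$23.32']
```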
