python re: r'\b \$ \d+ \b' won't match 'aug 12, 2010 abc $123'

Question

So I'm just making a script to collect $ values from a transaction-log type file:

import re
import sys

for line in sys.stdin:
    match = re.match(r'\b \$ (\d+) \b', line)
    if match is not None:
        for value in match.groups():
            print(value)

Right now I'm just trying to print those values. It would match a line containing $12323, but not when there are other things in the line. From what I read it should work, but it looks like I could be missing something.

Answer 1

re.match:

If zero or more characters at the beginning of string match this regular expression, return a corresponding MatchObject instance. Return None if the string does not match the pattern; note that this is different from a zero-length match.

What you are looking for is either re.search or re.findall:
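To make the quoted behaviour concrete, here is a minimal sketch using the sample line from the question title. The variable names are only illustrative. re.match tries the pattern at position 0 only, while re.search scans the whole string.

```python
import re

line = 'aug 12, 2010 abc $123'
pattern = r'\$(\d+)'

# re.match only attempts a match at the very start of the string.
at_start = re.match(pattern, line)

# re.search scans forward and returns the first match anywhere.
anywhere = re.search(pattern, line)

print(at_start)           # None, because the line starts with 'aug', not '$'
print(anywhere.group(1))  # 123
```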

#!/usr/bin/env python

import re
s = 'aug 12, 2010 abc $123'

print(re.findall(r'\$(\d+)', s))
# => ['123']

print(re.search(r'\$(\d+)', s).group())
# => $123

print(re.search(r'\$(\d+)', s).group(1))
# => 123
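Applied to the loop from the question, the same idea might look like the sketch below. The function name collect_amounts and the sample lines are assumptions made for illustration. The original script read from sys.stdin, which would work the same way because any iterable of lines can be passed in.

```python
import re

# Compile once: a literal '$' followed by one or more digits (captured).
AMOUNT = re.compile(r'\$(\d+)')

def collect_amounts(lines):
    """Return every captured amount, as a string, from an iterable of lines."""
    amounts = []
    for line in lines:
        # findall returns the captured group for each match in the line.
        amounts.extend(AMOUNT.findall(line))
    return amounts

# Hypothetical sample input standing in for the transaction log.
sample = ['aug 12, 2010 abc $123', 'no amount here', '$12323', 'paid $5 and $7']
print(collect_amounts(sample))  # ['123', '12323', '5', '7']
```

In the original script this would be called as collect_amounts(sys.stdin) after importing sys.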

Answer 2

By having a space between \$ and (\d+), the regex expects a space between them in your string. Is there such a space?
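A small sketch of this point, using the sample string from the other answer. The \b anchors are left out here so that only the effect of the spaces is visible. The re.VERBOSE variant is an addition for illustration: with that flag, unescaped whitespace in the pattern is ignored, so the spaced-out layout can be kept if it was only meant for readability.

```python
import re

s = 'aug 12, 2010 abc $123'

# Spaces in the pattern are literal: this needs ' $ 123 ' in the string.
with_spaces = re.search(r' \$ (\d+) ', s)

# The same tokens without the spaces.
without_spaces = re.search(r'\$(\d+)', s)

# With re.VERBOSE, unescaped whitespace in the pattern is ignored.
verbose = re.search(r' \$ (\d+) ', s, re.VERBOSE)

print(with_spaces)              # None
print(without_spaces.group(1))  # 123
print(verbose.group(1))         # 123
```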

Answer 3

I'm not entirely clear on what counts as acceptable for you, but from the statement

a line containing $12323 but not when there are other things in the line

I would take it that

'aug 12, 2010 abc $123'

is not supposed to match, as it has other text before the amount.

If you want to match an amount at the end of the line, here is the customary anti-regexp answer (even though I am not against using regexes in easy cases):

loglines = ['aug 12, 2010 abc $123', " $1 ", "a $1 amount", "exactly $1 - no less"]

# match $amount at end of line without other text after
for line in loglines:
    if '$' in line:
        _,_, amount = line.rpartition('$')
        try:
            amount = float(amount)
        except ValueError:
            # the text after the last '$' is not a number, so skip this line
            pass
        else:
            print("$%.2f" % amount)
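For comparison, a regex version of the same end-of-line rule might look like the sketch below. It rests on an assumption about what "amount at the end of the line" means (digits with an optional decimal part, followed only by whitespace), which the answer does not spell out, and the name amount_at_end is hypothetical.

```python
import re

# '$', then digits with an optional decimal part, then only whitespace
# up to the end of the string.
END_AMOUNT = re.compile(r'\$(\d+(?:\.\d+)?)\s*$')

def amount_at_end(line):
    """Return the trailing dollar amount as a float, or None if absent."""
    m = END_AMOUNT.search(line)
    return float(m.group(1)) if m else None

loglines = ['aug 12, 2010 abc $123', ' $1 ', 'a $1 amount', 'exactly $1 - no less']
for line in loglines:
    print(repr(line), amount_at_end(line))
```

One difference from the float() approach: this pattern rejects text such as 'inf' or '1e5' after the dollar sign, which float() would accept.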

Answer 4

Others have already pointed out some shortcomings of your regex (especially the mandatory spaces and re.match vs. re.search).

There is another thing, though: \b word-boundary anchors match only between an alphanumeric and a non-alphanumeric character. In other words, \b \$ will fail (even when doing a search instead of a match operation) unless the string has an alphanumeric character immediately before the space.

Example (admittedly contrived) to work with your regex:

>>> import re
>>> test = [" $1 ", "a $1 amount", "exactly $1 - no less"]
>>> for string in test:
...     print(re.search(r"\b \$\d+ \b", string))
...
None
<_sre.SRE_Match object at 0x0000000001DD4370>
None
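The three results follow from where \b can match. As a sketch, the snippet below reproduces them and then tries a variant that replaces the leading \b and the spaces with a lookbehind. That variant is an assumption about what the question intended, namely an amount that starts the line or follows whitespace.

```python
import re

test = [" $1 ", "a $1 amount", "exactly $1 - no less"]

# Original pattern: the leading r'\b ' needs a word character right before
# the space, and the trailing r' \b' needs one right after the space.
original = [bool(re.search(r"\b \$\d+ \b", s)) for s in test]

# '$' must be at the start of the string or preceded by whitespace;
# \b after the digits ends the number at a non-word character or the end.
relaxed = [re.search(r"(?<!\S)\$(\d+)\b", s).group(1) for s in test]

print(original)  # [False, True, False]
print(relaxed)   # ['1', '1', '1']
```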

Answer 1: score 6, accepted, 2010-09-16 15:45:13
Answer 2: score 3, 2010-09-16 15:43:18
Answer 3: score 1, 2010-09-16 18:22:46
Answer 4: score 0, 2010-09-16 16:15:02