Ignoring specific characters in a Python regular expression match

Question

I'm trying to extract values from strings such as '5 bucks', and I also want to handle '5bucks', while ignoring the word bucks when it appears alone with no number in front of it. I've been trying this regex:

(\d*)(?:\s?)(?=bucks|dollars)

and testing it on https://regex101.com/. It gives me two matches instead of one for the same string. Why is that? This is what I get:

Match 1:

Full match: 5

Group 1: 5

Match 2:

Full match:

Group 1:

In the second match, both appear to be empty. Is there a way to prevent my regex from finding these zero-length matches, or some other way to handle them?

Answer 1

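You get those matches because the pattern matches optional digits \d* and an optional whitespace character \s? at any position where the positive lookahead assertion is true, that is, wherever bucks or dollars is immediately to the right. Since both optional parts may match nothing, the position directly before bucks also produces an empty match.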
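The behaviour described above can be reproduced in Python with a short sketch. The variable names are only illustrative, and the final filtering step is one possible way to drop the empty match, not something taken from the answer itself.

```python
import re

# The pattern from the question: everything before the lookahead is optional.
pattern = re.compile(r"(\d*)(?:\s?)(?=bucks|dollars)")

# Collect (full match, group 1, span) for every match in the sample string.
matches = [(m.group(0), m.group(1), m.span()) for m in pattern.finditer("5 bucks")]
for full, group1, span in matches:
    print(repr(full), repr(group1), span)

# One possible workaround (an assumption, not the answer's fix):
# keep only the matches whose digit group is non-empty.
non_empty = [group1 for _, group1, _ in matches if group1]
```

The first match covers the digit and the space, and the second is a zero-length match at the position right before bucks, where the lookahead still succeeds.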

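To match either word, you could use an alternation | inside a non-capturing group, with an optional space before it to cover both '5 bucks' and '5bucks'. To prevent the words from being part of a larger word, you could add word boundaries \b. Using \d+ instead of \d* requires at least one digit, so bucks on its own is no longer matched: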

\b\d+ ?(?:bucks|dollars)\b
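As a rough sketch of how this pattern might be used from Python, the version below adds a capturing group around the digits so that only the number is returned. The capturing group, the name amount_re, and the helper extract_amounts are assumptions made for illustration; they are not part of the answer.

```python
import re

# Same pattern as above, with a capturing group around the digits (an addition
# for illustration) so findall returns only the numbers.
amount_re = re.compile(r"\b(\d+) ?(?:bucks|dollars)\b")

def extract_amounts(text):
    """Hypothetical helper: return every amount followed by bucks or dollars."""
    return [int(number) for number in amount_re.findall(text)]

print(extract_amounts("5 bucks"))
print(extract_amounts("5bucks"))
print(extract_amounts("bucks"))
```

Because \d+ needs at least one digit, the word bucks alone yields no match, and the trailing \b rejects longer words that merely start with bucks.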


Answer 2

'(\d+)\s*(bucks|dollars)?'

Then take the first item of the match (group 1), which holds the number.
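A minimal sketch of this approach in Python follows. The names loose_re and first_number are hypothetical. Note that the unit group is optional in this pattern, so it also matches a number that is not followed by bucks or dollars.

```python
import re

# Pattern from Answer 2: the unit group is optional.
loose_re = re.compile(r"(\d+)\s*(bucks|dollars)?")

def first_number(text):
    """Hypothetical helper: return group 1 of the first match, or None."""
    match = loose_re.search(text)
    return match.group(1) if match else None

print(first_number("5bucks"))
print(first_number("bucks"))
print(first_number("7 apples"))
```

If numbers without a unit should be skipped, checking that group 2 is not None, or using the stricter pattern from Answer 1, may be a better fit.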

Solution 1
Score 1, accepted, 2020-07-06 17:58:43

Solution 2
Score 0, 2020-07-06 17:35:40
