Return any number of matching groups with re.findall in Python
I have a relatively complex string that contains a bunch of data. I am trying to extract the relevant pieces of the string using a regex. The portions I am interested in are contained in square brackets, like this:
s = '"data":["value":3.44}] lol haha "data":["value":55.34}] "data":["value":2.44}] lol haha "data":["value":56.34}]'
And the regex I have built is as follows:
l = re.findall(r'\"data\"\:.*(\[.*\])', s)
I was expecting this to return
['["value":3.44}]', '["value":55.34}]', '["value":2.44}]', '["value":56.34}]']
But instead all I get is the last one, i.e.,
['["value":56.34}]']
How can I catch 'em all?
It's because quantifiers are greedy by default. So .* will match everything between the first "data": and the last [, so there's only one [...] left to match.
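To see the greedy behavior concretely, here is a minimal sketch that runs the original pattern once with re.search and checks where the match and the captured group land. The variable names whole_string_matched and last_bracket_start are only illustrative, and the redundant backslashes before the quotes and colon are dropped, which should not change what the pattern matches.

```python
import re

s = '"data":["value":3.44}] lol haha "data":["value":55.34}] "data":["value":2.44}] lol haha "data":["value":56.34}]'

# Same pattern as in the question, with the greedy .* left in place.
m = re.search(r'"data":.*(\[.*\])', s)

# The single match runs from the first "data": to the final ].
whole_string_matched = m.span() == (0, len(s))

# The captured group starts at the last [ in the string.
last_bracket_start = s.rfind('[')

print(whole_string_matched)
print(m.start(1) == last_bracket_start)
print(m.group(1))
```

Since that one match consumes the entire string, findall has nothing left to scan and returns a single group.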
Use non-greedy quantifiers by adding ? after each quantifier:
l = re.findall(r'\"data\"\:.*?(\[.*?\])', s)
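A short side-by-side sketch may help compare the two forms. The names greedy, lazy and negated are just illustrative. The third pattern uses a negated character class instead of a lazy quantifier. It is an alternative that is not part of the answer above, and it assumes the [ comes immediately after "data": as it does in the sample string.

```python
import re

s = '"data":["value":3.44}] lol haha "data":["value":55.34}] "data":["value":2.44}] lol haha "data":["value":56.34}]'

# Greedy: .* runs to the end of the string and backtracks to the last [...].
greedy = re.findall(r'"data":.*(\[.*\])', s)

# Non-greedy: .*? stops at the first position where the rest can match.
lazy = re.findall(r'"data":.*?(\[.*?\])', s)

# Alternative (assumption: "[" directly follows "data":): match anything
# except "]" inside the brackets, so no lazy quantifier is needed.
negated = re.findall(r'"data":(\[[^\]]*\])', s)

print(greedy)
print(lazy)
print(negated == lazy)
```

With the sample string, the lazy version yields all four bracketed pieces, while the greedy version yields only the last one.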
You can also use finditer to extract the relevant content iteratively:
import re
s = '"data":["value":3.44}] lol haha "data":["value":55.34}] "data":["value":2.44}] lol haha "data":["value":56.34}]'
for m in re.finditer(r'(\[.*?\])', s):
    print(m.group(1))
OUTPUT
["value":3.44}]
["value":55.34}]
["value":2.44}]
["value":56.34}]