Capture numbers inside square brackets
I want to capture all the numbers inside square brackets. The numbers are separated by commas. For example, I want to capture 7, 8 and 5 from the following text:
some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.
I tried the following pattern:
pat = (?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])
But for the "[7, 8]" case the matches seem to overlap, and I only get "8".
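The behaviour described above can be reproduced with the standard `re` module. This is only a minimal sketch; the variable names `text` and `matches` are illustrative and not part of the original post. The pattern has a single capture group, so each bracket pair yields one value, and the greedy `[\d,\s]*` in front of the group consumes everything up to the last digit.

```python
import re

text = "some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts."
pat = r'(?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])'

# findall returns the contents of the single capture group, once per match.
# For "[7, 8]" the greedy prefix backtracks just far enough to leave "8"
# for the group, so "7" is never reported.
matches = re.findall(pat, text)
print(matches)  # ['8', '5']
```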
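IMHO, lookbehind and lookahead are overkill here. You are probably better off matching the whole bracketed group and then slicing off the first and last bracket. Something like this is easier to read and understand: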
import re
sample = r"""
some text [7,8], some other [2, 3] texts with 1 or 2 numbers [5]. [4,
5] other texts
"""
# Match "[n, n, ...]" as a whole, then drop the surrounding brackets with s[1:-1]
result = [s[1:-1] for s in re.findall(r'\[\d+\s*(?:,\s*\d+)*\]', sample)]
print(result)
If you really want the regex itself to capture the result, you can use:
result = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', sample)
print(result)
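Both variants above return each bracket's contents as one string such as '7, 8'. If the individual numbers are needed, a second pass over each captured string can split them out. The following is a minimal sketch under that assumption; the helper name `numbers_in_brackets` is hypothetical and does not appear in the original answer.

```python
import re

sample = "some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts."

def numbers_in_brackets(text):
    """Return one list of ints per bracketed, comma-separated group."""
    # First pass: capture the contents of each "[n, n, ...]" group.
    groups = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', text)
    # Second pass: pull the individual digit runs out of each group.
    return [[int(n) for n in re.findall(r'\d+', g)] for g in groups]

print(numbers_in_brackets(sample))  # [[7, 8], [5]]
```

Numbers outside brackets, such as the "1 or 2" in the sample, are ignored because the first pass only matches bracketed groups.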
Using the PyPI regex module:
import regex
pat = r'\[(?P<numbers>\d+)(?:,\s*(?P<numbers>\d+))*]'
s = r'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'
results = [match.captures('numbers') for match in regex.finditer(pat, s)]
print(results)
See the Python proof.
Result: [['7', '8'], ['5']]
Explanation of the expression
--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?P<numbers>             group and capture to "numbers":
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \k<numbers>
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?P<numbers>             group and capture to "numbers":
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \k<numbers>
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  ]                        ']'
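For comparison, here is a small sketch of how the standard library `re` module treats the same kind of repeated group. It uses numbered groups because `re` rejects duplicate group names, and the input string (with an extra 9 added) is an illustrative assumption rather than the original sample. A repeated group in `re` keeps only its last iteration, which is why the `captures` method of the regex module is useful here.

```python
import re

# Hypothetical input with three numbers in the first bracket pair.
s = 'some text [7, 8, 9], some other texts [5]. other texts.'
pat = r'\[(\d+)(?:,\s*(\d+))*]'

# Group 2 sits inside a repeated non-capturing group, so after the match
# it holds only the value from the final iteration; "8" is lost.
last_only = [m.groups() for m in re.finditer(pat, s)]
print(last_only)  # [('7', '9'), ('5', None)]
```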