
Capture numbers inside square brackets

I want to capture all the numbers inside square brackets. The numbers are separated by commas. For example, I want to capture 7, 8, and 5 from this text:

some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.

I tried the following pattern:

pat = (?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])

But for the "[7, 8]" case the matches seem to overlap, and I only get "8".
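One way to see what is going on is to print the whole match next to the capture group. The sketch below uses only the standard `re` module; the variable name `matches` is mine, not part of the question.

```python
import re

pat = r'(?<=\[)[\d,\s]*(\d)[\d,\s]*(?=\])'
s = 'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'

# Each bracket yields exactly one match. The greedy [\d,\s]* swallows "7, "
# and backtracks just enough to leave the last digit for the single group.
matches = [(m.group(0), m.group(1)) for m in re.finditer(pat, s)]
print(matches)             # [('7, 8', '8'), ('5', '5')]

# With exactly one capture group, findall returns that group's text only.
print(re.findall(pat, s))  # ['8', '5']
```

So the bracket contents are consumed by a single match, and a capture group that takes part in one match can hold only one value.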

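IMHO, lookbehind and lookahead are overkill here. You are probably better off matching the whole bracketed group and then slicing off the first and last characters (the brackets). Something like this is easier to read and understand: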

import re

sample = r"""
some text [7,8], some other [2, 3] texts with 1 or 2 numbers [5]. [4,
5] other texts
"""

result = [ s[1:-1] for s in re.findall(r'\[\d+\s*(?:,\s*\d+)*\]', sample) ]
print(result)

If you really want the regex itself to capture the result, you can use a capturing group:

result = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', sample)
print(result)
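Note that both variants return each bracket's contents as one string (for example '2, 3'), not as individual numbers. A minimal sketch of turning those strings into integers, assuming every group matched the pattern above; `groups` and `numbers` are illustrative names:

```python
import re

sample = r"""
some text [7,8], some other [2, 3] texts with 1 or 2 numbers [5]. [4,
5] other texts
"""

# One string per bracket, e.g. '7,8' or '2, 3'.
groups = re.findall(r'\[(\d+\s*(?:,\s*\d+)*)\]', sample)

# Split each string on commas and convert the pieces to int.
numbers = [[int(part) for part in group.split(',')] for group in groups]
print(numbers)  # [[7, 8], [2, 3], [5], [4, 5]]
```

int() ignores surrounding whitespace, so the space in "[2, 3]" and the line break inside the last bracket need no special handling.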

Use the PyPI regex module, which allows a group name to be reused and keeps every capture of a repeated group:

import regex
pat = r'\[(?P<numbers>\d+)(?:,\s*(?P<numbers>\d+))*]'
s = r'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'
results = [match.captures('numbers') for match in regex.finditer(pat, s)]
print(results)

See the Python proof.

Results: [['7', '8'], ['5']]

Explanation of the expression

--------------------------------------------------------------------------------
  \[                       '['
--------------------------------------------------------------------------------
  (?P<numbers>               group and capture to "numbers":
--------------------------------------------------------------------------------
    \d+                      digits (0-9) (1 or more times (matching
                             the most amount possible))
--------------------------------------------------------------------------------
  )                        end of \k<numbers>
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (0 or more times
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    ,                        ','
--------------------------------------------------------------------------------
    \s*                      whitespace (\n, \r, \t, \f, and " ") (0
                             or more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    (?P<numbers>               group and capture to "numbers":
--------------------------------------------------------------------------------
      \d+                      digits (0-9) (1 or more times
                               (matching the most amount possible))
--------------------------------------------------------------------------------
    )                        end of \k<numbers>
--------------------------------------------------------------------------------
  )*                       end of grouping
--------------------------------------------------------------------------------
  ]                        ']'
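If installing a third-party module is not an option, roughly the same result can be had with the standard library. This is a sketch, not the answer's own method: the stdlib `re` module rejects duplicate group names and keeps only the last capture of a repeated group, so it validates the bracket with one pattern and pulls the numbers out in a second pass. `BRACKETED` and `numbers_in_brackets` are hypothetical names.

```python
import re

# The explained expression, written with re.VERBOSE so the notes live next
# to the pattern. A single unnamed group wraps the whole number list.
BRACKETED = re.compile(r"""
    \[                   # literal '['
    (                    # capture the whole comma-separated list
        \d+              # first number
        (?: ,\s* \d+ )*  # zero or more ", number" repetitions
    )
    ]                    # literal ']'
""", re.VERBOSE)


def numbers_in_brackets(text):
    """Return one list of digit strings per well-formed bracket group."""
    return [re.findall(r"\d+", m.group(1)) for m in BRACKETED.finditer(text)]


s = 'some text [7, 8], some other texts with 1 or 2 numbers [5]. other texts.'
print(numbers_in_brackets(s))  # [['7', '8'], ['5']]
```

The bare 1 and 2 in the sentence are skipped because they are not enclosed in brackets.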
