
Splitting a string with brackets using regular expression in python

Suppose I have a string like str = "[Hi all], [this is] [an example] ". I want to split it into several pieces, each consisting of the content inside one pair of brackets. In other words, I want to grab the phrase inside each pair of brackets. The result should look like:

['Hi all', 'this is', 'an example']

How can I achieve this using a regular expression in Python?

import re

data = "[Hi all], [this is] [an example] "
# The non-greedy .*? stops at the first closing bracket after each opening one.
print(re.findall(r"\[(.*?)\]", data))    # ['Hi all', 'this is', 'an example']


Try this:

import re
text = "[Hi all], [this is] [an example] "  # named text rather than str to avoid shadowing the built-in
contents = re.findall(r'\[(.*?)\]', text)
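To see why the `?` in `.*?` matters, here is a small sketch comparing a greedy pattern, the non-greedy pattern used above, and a negated character class as an alternative. The variable names are only illustrative, and the negated-class pattern is a substitute technique, not something from the answers above.

```python
import re

data = "[Hi all], [this is] [an example] "

# Greedy: .* runs to the LAST closing bracket, so everything becomes one match.
greedy = re.findall(r"\[(.*)\]", data)

# Non-greedy: .*? stops at the first closing bracket after each opening one.
lazy = re.findall(r"\[(.*?)\]", data)

# Alternative: match any run of characters that are not a closing bracket.
negated = re.findall(r"\[([^\]]*)\]", data)

print(greedy)   # ['Hi all], [this is] [an example']
print(lazy)     # ['Hi all', 'this is', 'an example']
print(negated)  # ['Hi all', 'this is', 'an example']
```

One difference between the last two: `.` does not match a newline by default, so the non-greedy pattern skips brackets whose content spans lines, while the negated class matches them.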

I've run into this problem a few times. Regular expressions work unless you have nested brackets. In the more general case where brackets may be nested, the following will work:


def bracketed_split(string, delimiter, strip_brackets=False):
    """ Split a string by the delimiter unless it is inside brackets.
    e.g.
        list(bracketed_split('abc,(def,ghi),jkl', delimiter=',')) == ['abc', '(def,ghi)', 'jkl']
    """

    openers = '[{(<'
    closers = ']})>'
    opener_to_closer = dict(zip(openers, closers))
    opening_bracket = dict()
    current_string = ''
    depth = 0
    for c in string:
        if c in openers:
            depth += 1
            opening_bracket[depth] = c
            if strip_brackets and depth == 1:
                continue
        elif c in closers:
            assert depth > 0, f"You exited more brackets than we have entered in string {string}"
            assert c == opener_to_closer[opening_bracket[depth]], f"Closing bracket {c} did not match opening bracket {opening_bracket[depth]} in string {string}"
            depth -= 1
            if strip_brackets and depth == 0:
                continue
        if depth == 0 and c == delimiter:
            yield current_string
            current_string = ''
        else:
            current_string += c
    assert depth == 0, f'You did not close all brackets in string {string}'
    yield current_string

>>> list(bracketed_split("[Hi all], [this is] [a example]".replace("[a example]", "[an example]"), delimiter=' '))
['[Hi all],', '[this is]', '[an example]']

>>> list(bracketed_split("[Hi all], [this is] [a [nested] example]", delimiter=' '))
['[Hi all],', '[this is]', '[a [nested] example]']
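
As a rough illustration of the nesting problem, the sketch below runs the simple regex on the nested input and compares it with a depth counter that collects only the top-level bracket contents, which is what the original question asked for. The helper `top_level_bracket_contents` is a hypothetical name introduced here, not part of the answer above, and it handles only a single bracket type.

```python
import re


def top_level_bracket_contents(text, opener='[', closer=']'):
    """Return the text inside each outermost opener/closer pair (hypothetical helper)."""
    contents = []
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == opener:
            if depth == 0:
                start = i + 1  # content begins just after the outermost opener
            depth += 1
        elif ch == closer:
            if depth == 0:
                raise ValueError(f"Unmatched closing bracket at index {i}")
            depth -= 1
            if depth == 0:
                contents.append(text[start:i])
    if depth != 0:
        raise ValueError("Not all brackets were closed")
    return contents


nested = "[Hi all], [this is] [a [nested] example]"

# The non-greedy regex stops at the first closing bracket, cutting the last group short.
regex_result = re.findall(r"\[(.*?)\]", nested)
print(regex_result)  # ['Hi all', 'this is', 'a [nested']

# Tracking depth keeps the nested group intact.
depth_result = top_level_bracket_contents(nested)
print(depth_result)  # ['Hi all', 'this is', 'a [nested] example']
```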
