
Python regex to split string by different brackets

I have a string that could look like:

"set<array<char, [100:140, 40:80]>>"

or

"set<array<struct{int foo, char bar}, [100:150, 50:80]>>"

Basically, the structure within the "array" could be a primitive, a struct of primitives, or a struct of structs.

Using Python's regex module, I want to get back something like this object:

{"base_type":"array", "type":"char"}

or, for the second one:

{"base_type":"array", "type":"struct", "sub_type":["int", "char"]}

Maybe there's a more elegant way to do this without using regular expressions. Any help would be really appreciated. :)

Based on the two test cases you provided, I came up with two regular expressions:

  1. "set<array<(char|float|int), .*>>" for cases where a primitive type is nested.
  2. "set<array<struct{((int|char|float)\\s+.*,)*\\s+((int|char|float) .*)}, .*>>" for cases where a struct type is nested.

You can use the match groups to find out which primitive type is nested and which types are in the struct.
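As a quick check of what the groups contain, the sketch below runs both patterns against the two example strings from the question. The variable names `primitive_example` and `struct_example` are illustrative only, and the patterns are written as raw strings so that `\s` reaches the regex engine unchanged.

```python
import re

# The two patterns from the list above, written as raw strings.
primary_regex = r"set<array<(char|float|int), .*>>"
struct_regex = (
    r"set<array<struct{((int|char|float)\s+.*,)*\s+((int|char|float) .*)}, .*>>"
)

# The two example inputs from the question.
primitive_example = "set<array<char, [100:140, 40:80]>>"
struct_example = "set<array<struct{int foo, char bar}, [100:150, 50:80]>>"

primitive_match = re.match(primary_regex, primitive_example)
struct_match = re.match(struct_regex, struct_example)

# Group 1 of the first pattern is the nested primitive type.
print(primitive_match.group(1))  # char

# The second pattern has four groups; the type names sit at the odd indices.
print(struct_match.groups())  # ('int foo,', 'int', 'char bar', 'char')
```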

Here is my solution in Python:

# -*- coding: utf-8 -*-
import re

# Raw strings keep \s as a regex escape instead of a string escape.
primary_regex = r"set<array<(char|float|int), .*>>"
struct_regex = (
    r"set<array<struct{((int|char|float)\s+.*,)*\s+((int|char|float) .*)}, .*>>"
)


def extract(define_str):
    m = re.match(primary_regex, define_str)
    result = {
        'base_type': 'array',
    }

    if m is None:
        m = re.match(struct_regex, define_str)

        if m is None:
            # Invalid define_str, return None
            return None

        # m.groups() returns a tuple like
        # ('int foo,', 'int', 'char bar', 'char');
        # the odd-indexed items are the type names.
        sub_type = list(m.groups()[1::2])
        result['type'] = 'struct'
        result['sub_type'] = sub_type
    else:
        primary_type = m.group(1)
        result['type'] = primary_type

    return result
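One caveat: in Python's `re`, a repeated capture group such as `(...)*` keeps only its last repetition, so a struct with three or more fields would lose some of its types with the pattern above. A possible variation, sketched below, first isolates the struct body and then collects the field types with `findall`. The names `outer_regex`, `field_regex`, and `extract_any_fields` are hypothetical, and the sketch assumes fields are written as `type name` and that structs are not nested.

```python
import re

PRIMITIVES = "char|float|int"

# Matches either a bare primitive or a flat (non-nested) struct body.
outer_regex = re.compile(
    r"set<array<(?:(?P<prim>" + PRIMITIVES + r")|struct\{(?P<body>[^{}]*)\}), .*>>"
)
# Captures the type keyword at the start of each "type name" field.
field_regex = re.compile(r"\b(" + PRIMITIVES + r")\s+\w+")


def extract_any_fields(define_str):
    m = outer_regex.match(define_str)
    if m is None:
        # Unrecognised input, including nested structs.
        return None

    result = {"base_type": "array"}
    if m.group("prim") is not None:
        result["type"] = m.group("prim")
    else:
        result["type"] = "struct"
        # findall returns one captured type per field, in order.
        result["sub_type"] = field_regex.findall(m.group("body"))
    return result


print(extract_any_fields("set<array<char, [100:140, 40:80]>>"))
print(extract_any_fields("set<array<struct{int foo, char bar}, [100:150, 50:80]>>"))
```

A struct of structs cannot be described by a single pattern like this; that case would probably need a small recursive parser instead.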

Hope this helps.
