
RegEx to find nested Code Blocks

I'm writing a code formatter and I need some help. I have to find the code blocks and I want to use regular expressions. The code I need to format looks basically like this:

KEYWORD name {
    word
    word
    ...
}

I am able to find the blocks that start with { and end with } with this expression:

[{](.*?)[}]

But I don't know how to add the "KEYWORD name" part to the expression. Both are custom strings that can contain any character except ; , { and } .

Another problem is that my code blocks can be nested. I don't know how to add that feature.

You can just do:

KEYWORD name {.*?}

Since you want the . to match newlines as well, you'll have to enable single-line (dot-all) mode. Multi-line mode only changes how ^ and $ behave.
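As a rough illustration in Python (the question does not name a language, so this choice is an assumption), the flag that lets . cross line breaks is re.DOTALL. The sample text below is made up, and the braces are escaped as a precaution:

```python
import re

# Made-up sample input in the shape described in the question.
source = """KEYWORD name {
    word
    word
}"""

pattern = r"KEYWORD name \{.*?\}"

# Without DOTALL, '.' cannot match a newline, so the search fails.
no_flag = re.search(pattern, source)

# With DOTALL, '.' also matches newlines and the whole block is found.
with_flag = re.search(pattern, source, re.DOTALL)

print(no_flag)
print(with_flag.group(0))
```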

Since both KEYWORD and name are arbitrary strings that can contain any character except ; , { and } , you can use a negated character class for each of them:

[^;,{}]+\s+[^;,{}]+\s*{.*?}
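A minimal sketch of how this pattern might be used to pull several sibling blocks out of a text, assuming Python and its re.DOTALL flag. The sample input is invented, and the braces are escaped here, which some regex engines require:

```python
import re

# Header parts: any characters except ';', ',', '{' and '}'.
# The body is matched lazily; re.DOTALL lets '.' cross newlines.
BLOCK = re.compile(r"[^;,{}]+\s+[^;,{}]+\s*\{.*?\}", re.DOTALL)

# Invented sample with two blocks that are not nested.
source = "FOO first {\n    a\n}\nBAR second {\n    b\n}"

# The negated class also matches whitespace, so a match can start with
# the newline left over from the previous block; strip() removes it.
blocks = [m.group(0).strip() for m in BLOCK.finditer(source)]

for block in blocks:
    print(block)
```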

(.+?)\\s+(.+?)\\s+{(.*?)}

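This is: as few characters as possible up to the next whitespace, followed by one or more whitespace characters, followed by the same again, and then your code block in braces. The doubled backslashes are what you would write inside a string literal (in Java, for example); the regex itself uses \s.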

If the KEYWORD can only contain uppercase letters and the name , let's say all letters, digits and underscores, it should look like this:

([A-Z]+?)\s+([A-Za-z0-9_]+?)\s+\{(.*?)\}
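The three groups can then be read back individually. A small sketch, assuming Python and writing the name class with its closing bracket as [A-Za-z0-9_]+; the keyword TASK and the name build_1 are invented for the example:

```python
import re

# Group 1: keyword, group 2: name, group 3: block body.
BLOCK = re.compile(r"([A-Z]+)\s+([A-Za-z0-9_]+)\s+\{(.*?)\}", re.DOTALL)

# Invented sample input.
source = "TASK build_1 {\n    compile\n    link\n}"

match = BLOCK.search(source)
keyword, name, body = match.groups()

# Split the body on whitespace to get the individual words.
words = body.split()

print(keyword, name, words)
```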

Note that if your code blocks can be nested, you'll have problems with this regex: the lazy .*? pairs the first { with the first }, so the match for an outer block ends at the closing brace of its first inner block.
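One common way around this, sketched here in Python as an illustration rather than as part of the answers above, is to use a regex only for the header and then count braces by hand to find the matching close. The names HEADER and find_blocks are hypothetical. The header pattern also assumes that KEYWORD and name contain no whitespace, which is stricter than the question states:

```python
import re

# Hypothetical header pattern: two whitespace-separated tokens, then '{'.
HEADER = re.compile(r"([^;,{}\s]+)\s+([^;,{}\s]+)\s*\{")


def find_blocks(text):
    """Return (keyword, name, body) for each outermost block in text."""
    blocks = []
    pos = 0
    while True:
        header = HEADER.search(text, pos)
        if header is None:
            break
        depth = 1  # the header already consumed one '{'
        i = header.end()
        while i < len(text) and depth > 0:
            if text[i] == "{":
                depth += 1
            elif text[i] == "}":
                depth -= 1
            i += 1
        if depth != 0:
            raise ValueError("unbalanced braces")
        body = text[header.end():i - 1]  # exclude the closing '}'
        blocks.append((header.group(1), header.group(2), body))
        pos = i  # continue after this block
    return blocks


# Invented sample with one block nested inside another.
source = "OUTER a {\n    INNER b {\n        word\n    }\n    word\n}"

# The lazy regex stops at the first '}', cutting the outer block short.
naive = re.search(r"\{(.*?)\}", source, re.DOTALL).group(0)

top = find_blocks(source)        # outermost blocks
inner = find_blocks(top[0][2])   # recurse into a body for nested blocks
print(top[0][:2], inner[0][:2])
```

Calling find_blocks again on a returned body walks down one nesting level at a time, which is usually enough for a formatter that re-indents by depth. Some engines (PCRE, for instance) support recursive patterns that can match balanced braces directly, but Python's built-in re module does not.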
