
Regex to split string on dots that are not inside brackets

As the title says, I want to write a regex r such that Splitter.onPattern(r).splitToList("a.b.c.d[e.f.g]") results in

[a, b, c, d[e.f.g]]

I have been experimenting but can't get it right. I thought "\\.((?!\\[)*)\\]*" would work (it should match any dot that is followed by a string not containing '[' that ends with ']'), but for some reason it still splits on every dot.

For the samples shown, try the following. It assumes that [ and ] are balanced and not nested.

\.(?![^[]*])


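Explanation: match a dot unless it is followed by zero or more characters other than [ and then a ]. In other words, the negative lookahead rejects any dot that has a ] ahead of it with no [ in between, which is exactly the case of a dot inside brackets.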
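As a minimal sketch of how this pattern could be used from Java, the snippet below applies it with the standard String.split instead of Guava's Splitter. The class name SplitOutsideBrackets and the helper splitOnDots are illustrative, not from the original answer. The brackets are escaped here because java.util.regex treats an unescaped [ inside a character class as the start of a nested class.

```java
import java.util.Arrays;
import java.util.List;

public class SplitOutsideBrackets {
    // A dot that is NOT followed by "non-'[' characters, then ']'".
    // Brackets are escaped so Java does not parse a nested character class.
    static final String DOT_OUTSIDE_BRACKETS = "\\.(?![^\\[]*\\])";

    // Hypothetical helper: splits on dots outside brackets.
    // Assumes brackets are balanced and not nested.
    static List<String> splitOnDots(String input) {
        return Arrays.asList(input.split(DOT_OUTSIDE_BRACKETS));
    }

    public static void main(String[] args) {
        System.out.println(splitOnDots("a.b.c.d[e.f.g]")); // [a, b, c, d[e.f.g]]
    }
}
```

The same pattern string can be passed to Splitter.onPattern as in the question.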

The trick is to assert that the next bracket character, if there is one, is not a closing bracket.

In regex terms, this is expressed as "not followed by any number of non-bracket characters and then a ]":

"\\.(?![^\\[\\]]*])"

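A short sketch of this lookahead in use, assuming a precompiled java.util.regex.Pattern rather than Guava. The class name BracketAwareSplit and the second input string are made up for illustration; the second input simply checks that text after a closing bracket is still split.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class BracketAwareSplit {
    // Dot not followed by (a run of non-bracket characters, then ']').
    private static final Pattern DOT = Pattern.compile("\\.(?![^\\[\\]]*])");

    // Hypothetical helper; assumes balanced, non-nested brackets.
    static List<String> split(String input) {
        return Arrays.asList(DOT.split(input));
    }

    public static void main(String[] args) {
        System.out.println(split("a.b.c.d[e.f.g]"));  // [a, b, c, d[e.f.g]]
        System.out.println(split("a.b[c.d].e[f.g]")); // [a, b[c.d], e[f.g]]
    }
}
```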


Note: This does not work for nested dotted expressions like ab[cd[ef].g].h, because the lookahead only inspects the nearest bracket and cannot track nesting depth.
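If nested brackets do need to be supported, one option is to drop the regex and scan the string while counting bracket depth. The following is only a sketch of that alternative technique, not part of the original answer; the class name NestedSplit, the method splitTopLevel, and the sample input are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class NestedSplit {
    // Splits on dots at bracket depth 0 only, so nested groups stay intact.
    // Unmatched ']' characters are kept but do not push the depth below zero.
    static List<String> splitTopLevel(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;
        for (char ch : input.toCharArray()) {
            if (ch == '.' && depth == 0) {
                // Top-level dot: close the current token.
                parts.add(current.toString());
                current.setLength(0);
                continue;
            }
            if (ch == '[') {
                depth++;
            } else if (ch == ']' && depth > 0) {
                depth--;
            }
            current.append(ch);
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("a.b[c.d[e.f].g].h")); // [a, b[c.d[e.f].g], h]
    }
}
```

Unlike String.split, this sketch keeps empty tokens, including a trailing one when the input ends with a dot.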

Alternatively, instead of splitting, extract the tokens by matching them with:

[^\[\].]+(?:\[[^\[\]]*])?


EXPLANATION

--------------------------------------------------------------------------------
  [^\[\].]+                any character except: '\[', '\]', '.' (1
                           or more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (?:                      group, but do not capture (optional
                           (matching the most amount possible)):
--------------------------------------------------------------------------------
    \[                       '['
--------------------------------------------------------------------------------
    [^\[\]]*                 any character except: '\[', '\]' (0 or
                             more times (matching the most amount
                             possible))
--------------------------------------------------------------------------------
    ]                        ']'
--------------------------------------------------------------------------------
  )?                       end of grouping

Java example code:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
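        // Token: a run of non-bracket, non-dot characters, optionally
        // followed by one bracketed group such as [e.f.g].
        final String regex = "[^\\[\\].]+(?:\\[[^\\[\\]]*])?";
        final String string = "a.b.c.d[e.f.g]";

        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);

        // Prints a, b, c and d[e.f.g], one per line.
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }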
    }
}
