
Regex: Looking for dots in a sentence except inside brackets

I'm looking for a regex to split a Java string on the dots in a sentence, except when those dots are between brackets. That is to say, in this example sentence:

word1.word2.word3[word4.word5[word6.word7]].word8

I would like to split only on the first two dots and the last one (just before "word8").

I managed to get as far as this regex:

\.(?![^\[]*?\])

But it's not good enough, as it also splits on the dot between word4 and word5 :-(
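To make the failure concrete, here is a minimal sketch that runs the regex above through `String.split`. The class and method names (`LookaheadSplitDemo`, `splitWithLookahead`) are made up for illustration. The negative lookahead only rejects a dot when a `]` follows with no `[` in between, so the dot before "word5" slips through because the next bracket after it is an opening one.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {

    // Hypothetical helper: splits on the regex from the question,
    // i.e. a dot that is NOT followed by "]" without a "[" in between.
    static String[] splitWithLookahead(String input) {
        return input.split("\\.(?![^\\[]*?\\])");
    }

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";
        // Prints: [word1, word2, word3[word4, word5[word6.word7]], word8]
        System.out.println(Arrays.toString(splitWithLookahead(str)));
    }
}
```

The third and fourth pieces show the unwanted split inside the outer brackets.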

Any ideas on how to solve this particular case?

Judging from the PerlMonks discussions, I don't think this can be solved in Java with a single regex: java.util.regex has no recursive patterns, so it cannot keep track of arbitrarily nested brackets.

If you are okay with using multiple steps, you could first remove all pairs of brackets (starting with the innermost) and then split the remaining string on dots:

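import java.util.regex.Pattern;

public class SplitOutsideBrackets {

    // Matches an innermost bracket pair: "[", then no brackets, then "]".
    // "*" rather than "+" so that an empty pair "[]" is removed as well.
    private static final Pattern BRACKET_PAIR = Pattern.compile("\\[[^\\[\\]]*\\]");

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";

        // Strip innermost pairs repeatedly until no bracket pair remains.
        while (BRACKET_PAIR.matcher(str).find()) {
            str = BRACKET_PAIR.matcher(str).replaceFirst("");
        }

        // Only dots outside brackets are left, so a plain split is safe.
        for (String word : str.split("\\.")) {
            System.out.println(word);
        }
    }
}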

This produces the following output (note that the bracketed text is dropped along with its brackets):

word1
word2
word3
word8
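If the bracketed text needs to be kept in the result, a different technique from the one above is to skip regex entirely and scan the string once while counting bracket depth. The following is a minimal sketch under the assumption that brackets are balanced; the names `BracketAwareSplitter` and `splitOutsideBrackets` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BracketAwareSplitter {

    // Hypothetical helper: splits on dots that sit at bracket depth 0.
    // Assumes balanced brackets; a stray "]" at depth 0 is kept as plain text.
    static List<String> splitOutsideBrackets(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '[') {
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--;
            }

            if (c == '.' && depth == 0) {
                // Top-level dot: close the current piece and start a new one.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString());
        return parts;
    }

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";
        // Prints: [word1, word2, word3[word4.word5[word6.word7]], word8]
        System.out.println(splitOutsideBrackets(str));
    }
}
```

Unlike `String.split`, this sketch keeps empty pieces, so an input ending in a dot yields a trailing empty string.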
