
Regex: Looking for dots in a sentence except inside brackets

I'm looking for a regex to split a Java string on the dots in a sentence, except when those dots are between brackets. That is, in this example sentence:

word1.word2.word3[word4.word5[word6.word7]].word8

I would like to split only on the first two dots and the last one (just before "word8").

I managed to get to this regex:

\.(?![^\[]*?\])

But it's not good enough, as it also splits on the dot between word4 and word5 :-(
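To make the failure concrete, here is a small sketch that runs the lookahead regex on the example string. The class and method names are only illustrative. The lookahead refuses a dot only when a "]" follows before any "[", so the dot after word4, which is followed by "word5[", is still treated as a separator.

```java
import java.util.Arrays;

public class LookaheadSplitDemo {

    // Splits on every dot that is NOT followed by a "]" before the next "[".
    public static String[] splitWithLookahead(String input) {
        return input.split("\\.(?![^\\[]*?\\])");
    }

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";
        System.out.println(Arrays.toString(splitWithLookahead(str)));
        // prints [word1, word2, word3[word4, word5[word6.word7]], word8]
    }
}
```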

Any idea how to solve this particular case?

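Judging from discussions on PerlMonks, I don't think this problem can be solved in Java with a single regex: the brackets can nest to any depth, and java.util.regex has no recursive patterns to match that.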

If you are okay with using multiple steps, then you could first remove all pairs of brackets (starting with the innermost) and then split the remaining string by dots:

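import java.util.regex.Pattern;

public class SplitOutsideBrackets {

    // Matches an innermost bracket pair: "[", one or more non-bracket characters, "]".
    private static final Pattern BRACKET_PAIR = Pattern.compile("\\[[^\\[\\]]+\\]");

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";

        // Remove innermost pairs repeatedly until no bracket pair is left.
        while (BRACKET_PAIR.matcher(str).find()) {
            str = BRACKET_PAIR.matcher(str).replaceFirst("");
        }

        // Every remaining dot is now a top-level separator.
        for (String word : str.split("\\.")) {
            System.out.println(word);
        }
    }
}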

This produces the output:

word1
word2
word3
word8
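Note that this output drops the bracketed text entirely. If the bracketed part should stay attached to its word (so the third piece is "word3[word4.word5[word6.word7]]"), one possible alternative is to skip regex and scan the string once while counting bracket depth. The following is only a sketch under that assumption. The names BracketAwareSplit and splitOutsideBrackets are made up for illustration, and an unmatched "]" is simply ignored.

```java
import java.util.ArrayList;
import java.util.List;

public class BracketAwareSplit {

    // Splits on '.' only while the bracket nesting depth is zero.
    public static List<String> splitOutsideBrackets(String input) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;

        for (char c : input.toCharArray()) {
            if (c == '[') {
                depth++;
            } else if (c == ']' && depth > 0) {
                depth--; // an unmatched ']' leaves the depth at zero
            }

            if (c == '.' && depth == 0) {
                // Top-level dot: close the current piece and start a new one.
                parts.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        parts.add(current.toString()); // last piece, after the final separator
        return parts;
    }

    public static void main(String[] args) {
        String str = "word1.word2.word3[word4.word5[word6.word7]].word8";
        for (String part : splitOutsideBrackets(str)) {
            System.out.println(part);
        }
        // prints:
        // word1
        // word2
        // word3[word4.word5[word6.word7]]
        // word8
    }
}
```

Unlike String.split, this sketch keeps empty pieces, including a trailing one when the input ends with a dot.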

