
How to split alphabetic strings by dot using regex?

I'd like to split a string on . only if all of its other characters are alphabetic and it doesn't start or end with a dot.

So the expected result for abc.def.xyz would be [abc,def,xyz] .

The following strings should be left as they are: abc. xy.a3 1a.ab abc.def,xyz

Basically I'm looking for a more elegant solution than my current code:

if(canSplit(x)){
   var parts = x.split("\\.");
   ...
}

boolean canSplit(String text) {
    if(text.startsWith(".") || text.endsWith(".")) return false;
    
    for(var s : text.split("\\.")) {
        for(int i = 0; i < s.length(); i++) {
            if(!Character.isAlphabetic(s.charAt(i))) return false;
        }
    }
    return true;        
}

You may use this regex and grab capture group #1 from each match:

(?:^(?=\p{L}+(?:\.\p{L}+)+$)|(?!^)\G\.)(\p{L}+)


Details:

  • (?=\p{L}+(?:\.\p{L}+)+$) ensures the whole line consists only of runs of letters separated by single dots
  • \G asserts position at the end of the previous match or the start of the string for the first match
  • (?!^) ensures that we don't allow \G to match at the start

Java Code:

jshell> String str = "abc.def.xyz";
str ==> "abc.def.xyz"

jshell> String re = "(?:^(?=\\p{L}+(?:\\.\\p{L}+)+$)|(?!^)\\G\\.)(\\p{L}+)";
re ==> "(?:^(?=\\p{L}+(?:\\.\\p{L}+)+$)|(?!^)\\G\\.)(\\p{L}+)"

jshell> Pattern.compile(re, Pattern.MULTILINE).matcher(str).results().flatMap(mr -> IntStream.rangeClosed(1, mr.groupCount()).mapToObj(mr::group)).collect(Collectors.toList());
$6 ==> [abc, def, xyz]
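
For use outside jshell, the same pattern can be wrapped in a small class. This is only a sketch: the class name DotSplitDemo and the method name parts are made up for illustration, and it assumes single-line inputs, so the MULTILINE flag is dropped. It also runs the strings from the question that should not be split, which produce no matches at all.

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class DotSplitDemo {
    // Same regex as above; compiled once and reused.
    private static final Pattern PARTS = Pattern.compile(
            "(?:^(?=\\p{L}+(?:\\.\\p{L}+)+$)|(?!^)\\G\\.)(\\p{L}+)");

    // Returns the dot-separated parts, or an empty list if the input
    // is not made up purely of letter runs joined by single dots.
    static List<String> parts(String text) {
        return PARTS.matcher(text).results()   // Matcher.results() needs Java 9+
                .map(mr -> mr.group(1))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String s : List.of("abc.def.xyz", "abc.", "xy.a3", "1a.ab", "abc.def,xyz")) {
            System.out.println(s + " -> " + parts(s));
        }
    }
}
```

Running main prints abc.def.xyz -> [abc, def, xyz] and an empty list [] for each of the other four inputs, so an empty result can serve as the "leave the string as it is" signal.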

There is nothing wrong with your approach. But if you want to use regex, your canSplit method could look like this:

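boolean canSplit(String text) {
    // Runs of letters joined by single dots, with no leading or trailing dot.
    // Note: [a-z] covers lowercase ASCII only; use \\p{L} to accept any Unicode letter.
    String regex = "[a-z]+(?:\\.[a-z]+)+";
    return text.matches(regex);
}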
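
Putting the check and the split together might look like the sketch below. The names ValidateThenSplit and splitOrKeep are hypothetical, and \p{L} is swapped in for [a-z] on the assumption that uppercase and non-ASCII letters should also be accepted. Character.isAlphabetic in the question is slightly broader than \p{L}, so this is an approximation of the original check rather than an exact equivalent.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ValidateThenSplit {
    // Letter runs joined by single dots; at least one dot is required.
    private static final Pattern DOTTED_LETTERS =
            Pattern.compile("\\p{L}+(?:\\.\\p{L}+)+");

    // Splits on dots when the whole string matches the pattern,
    // otherwise returns the string unchanged as a single element.
    static List<String> splitOrKeep(String text) {
        if (DOTTED_LETTERS.matcher(text).matches()) {
            return Arrays.asList(text.split("\\."));
        }
        return List.of(text);
    }

    public static void main(String[] args) {
        System.out.println(splitOrKeep("abc.def.xyz"));
        System.out.println(splitOrKeep("abc.def,xyz"));
    }
}
```

Compiling the pattern once avoids recompiling it on every call, which is what String.matches does internally.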
