Splitting string into two strings with regex

This question has been asked several times before, but I couldn't find an answer that fits my case: I need to split a string into two strings. The first part is a date and the second part is text. This is what I have so far:

String test = "24.12.17 18:17 TestString";
String[] testSplit = test.split("\\d{2}.\\d{2}.\\d{2} \\d{2}:\\d{2}");
System.out.println(testSplit[0]);           // "24.12.17 18:17" <-- Does not work
System.out.println(testSplit[1].trim());    // "TestString" <-- works

I can extract "TestString", but I lose the date. Is there a better (or simpler) way? Help is highly appreciated!

Skip regex; Use three strings

You are working too hard. No need to include the date and the time together as one. Regex is tricky, and life is short.

Just use the plain String::split for three pieces, and re-assemble the date-time.

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

String[] pieces = "24.12.17 18:17 TestString".split( " " ) ;  // Split into 3 strings.
LocalDate ld = LocalDate.parse( pieces[0] , DateTimeFormatter.ofPattern( "dd.MM.uu" ) ) ;  // Parse the first string as a date value (`LocalDate`).
LocalTime lt = LocalTime.parse( pieces[1] , DateTimeFormatter.ofPattern( "HH:mm" ) ) ;  // Parse the second string as a time-of-day value (`LocalTime`).
LocalDateTime ldt = LocalDateTime.of( ld , lt ) ;  // Reassemble the date with the time (`LocalDateTime`).
String description = pieces[2] ;  // Use the last remaining string.

See this code run live at IdeOne.com.

ldt.toString(): 2017-12-24T18:17

description: TestString

Tip: If you have any control over that input, switch to using standard ISO 8601 formats for date-time values in text. The java.time classes use the standard formats by default when generating/parsing strings.
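To illustrate the tip, here is a minimal sketch that assumes the input could be changed to carry the date-time in ISO 8601 form (for example `2017-12-24T18:17`). The class name `IsoParseSketch` and the helper `parseIso` are made up for this example. With that format, no `DateTimeFormatter` pattern has to be written at all:

```java
import java.time.LocalDateTime;

class IsoParseSketch {
    // Hypothetical helper: parses an ISO 8601 local date-time.
    // LocalDateTime.parse(CharSequence) uses DateTimeFormatter.ISO_LOCAL_DATE_TIME by default.
    static LocalDateTime parseIso(String text) {
        return LocalDateTime.parse(text);
    }

    public static void main(String[] args) {
        LocalDateTime ldt = parseIso("2017-12-24T18:17");
        System.out.println(ldt);  // prints 2017-12-24T18:17
    }
}
```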

You want to match only the separator. By matching the date, you consume it (it's thrown away).
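A small sketch makes the consumption visible. It splits on the date-time pattern the way the question does (with the dots escaped here, which the question's code did not do); the class name `SplitConsumesDate` and the helper `splitOnDate` are invented for illustration. Because the whole date-time is treated as the delimiter, the first element comes back empty:

```java
class SplitConsumesDate {
    // Hypothetical helper: uses the date-time itself as the split delimiter.
    static String[] splitOnDate(String input) {
        return input.split("\\d{2}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2}");
    }

    public static void main(String[] args) {
        String[] parts = splitOnDate("24.12.17 18:17 TestString");
        System.out.println(parts.length);          // prints 2
        System.out.println("[" + parts[0] + "]");  // prints [] : the date was the delimiter, so nothing is left
        System.out.println("[" + parts[1] + "]");  // prints [ TestString]
    }
}
```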

Use a lookbehind, which asserts but does not consume:

test.split("(?<=^.{14}) ");

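This regex means "split on a space that is preceded by exactly 14 characters counted from the start of input". The date-time "24.12.17 18:17" is 14 characters long, so only the space right after it qualifies.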


Your test code now works:

String test = "24.12.17 18:17 TestString";
String[] testSplit = test.split("(?<=^.{14}) ");
System.out.println(testSplit[0]);           // "24.12.17 18:17" <-- works
System.out.println(testSplit[1].trim());    // "TestString" <-- works

If your string is always in this format (and is well formed), you do not even need a regex. Just split at the second space using .substring and .indexOf:

String test = "24.12.17 18:17 TestString";
int idx = test.indexOf(" ", test.indexOf(" ") + 1);
System.out.println(test.substring(0, idx));
System.out.println(test.substring(idx).trim());

See the Java demo.
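The snippet above relies on the input containing at least two spaces. If it does not, `indexOf` returns -1 and `substring(0, -1)` throws a `StringIndexOutOfBoundsException`. A guarded variant might look like the sketch below; the class `SecondSpaceSplit`, the method `splitAtSecondSpace`, and the choice to return `null` for malformed input are assumptions made for this example, not part of the answer.

```java
class SecondSpaceSplit {
    // Hypothetical helper: returns {dateTime, text},
    // or null when the input contains fewer than two spaces.
    static String[] splitAtSecondSpace(String input) {
        int first = input.indexOf(' ');
        if (first < 0) {
            return null;
        }
        int second = input.indexOf(' ', first + 1);
        if (second < 0) {
            return null;
        }
        return new String[] {
            input.substring(0, second),          // everything before the second space
            input.substring(second + 1).trim()   // everything after it
        };
    }

    public static void main(String[] args) {
        String[] parts = splitAtSecondSpace("24.12.17 18:17 TestString");
        System.out.println(parts[0]);  // prints 24.12.17 18:17
        System.out.println(parts[1]);  // prints TestString
        System.out.println(splitAtSecondSpace("NoSpacesHere") == null);  // prints true
    }
}
```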

If you want to make sure your string starts with a datetime value, you may use a matching approach to match the string with a pattern containing 2 capturing groups: one will capture the date and the other will capture the rest of the string:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

String test = "24.12.17 18:17 TestString";
String pat = "^(\\d{2}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2})\\s(.*)";
Matcher matcher = Pattern.compile(pat, Pattern.DOTALL).matcher(test);
if (matcher.find()) {
    System.out.println(matcher.group(1));          // the date-time part
    System.out.println(matcher.group(2).trim());   // the rest of the string
}

See the Java demo.

Details:

  • ^ - start of string
  • (\\d{2}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2}) - Group 1: a datetime pattern (xx.xx.xx xx:xx-like pattern)
  • \\s - a whitespace (if it is optional, add * after it)
  • (.*) - Group 2 capturing any 0+ chars up to the end of string (. will match line breaks, too, because of the Pattern.DOTALL flag).
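As a possible follow-up, the two captured groups can be turned into typed values by parsing Group 1 with java.time. This is only a sketch: the names `DatedLineParser` and `DatedLine`, the `null` return for non-matching input, and the formatter pattern `dd.MM.uu HH:mm` (day.month.two-digit-year) are assumptions chosen for illustration. Note that the regex only checks the shape of the date, so a value such as "99.99.99 99:99" would match and then fail in `LocalDateTime.parse` with a `DateTimeParseException`.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DatedLineParser {
    // Same pattern as above: Group 1 is the date-time, Group 2 is the rest.
    private static final Pattern LINE =
            Pattern.compile("^(\\d{2}\\.\\d{2}\\.\\d{2} \\d{2}:\\d{2})\\s(.*)", Pattern.DOTALL);

    // Assumed layout: day.month.two-digit-year, then 24-hour time.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("dd.MM.uu HH:mm");

    // Hypothetical value holder for the two parts.
    static final class DatedLine {
        final LocalDateTime when;
        final String text;

        DatedLine(LocalDateTime when, String text) {
            this.when = when;
            this.text = text;
        }
    }

    // Returns null when the input does not start with a date-time of the expected shape.
    static DatedLine parse(String input) {
        Matcher m = LINE.matcher(input);
        if (!m.find()) {
            return null;
        }
        return new DatedLine(LocalDateTime.parse(m.group(1), FORMAT), m.group(2).trim());
    }

    public static void main(String[] args) {
        DatedLine line = parse("24.12.17 18:17 TestString");
        System.out.println(line.when);  // prints 2017-12-24T18:17
        System.out.println(line.text);  // prints TestString
    }
}
```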
