I have a string like this:
DATA/2019-00-01-23.x
I want to get three tokens (text, date and hour):
[DATA, 2019-00-01, 23]
I tried this:
String x = "DATA/2019-00-01-23.x";
System.out.println(Arrays.toString(x.split("/|-[0-9]+.")));
This returns:
[DATA, 2019, 01, x]
You can actually do this with a single split, like
x.split("/|-(?=[^-]*$)|\\D+$")
See the Java demo, output: [DATA, 2019-00-01, 23].
This regex will split at:
/ - a slash
| - or
-(?=[^-]*$) - the last hyphen in the string
| - or
\\D+$ - one or more non-digit chars at the end of the string
Because String.split(regex) is run with a limit argument of 0, the match at the end of the string does not result in a trailing empty item in the resulting array.
You can remove the last part, from the dot onward, and then split with /|(\\-)(?!.*\\-):
String[] split = "DATA/2019-00-01-23.x".replaceFirst("\\..*$", "")
.split("/|(\\-)(?!.*\\-)"); // [DATA, 2019-00-01, 23]
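The note above about the limit argument can be checked directly. The following is a minimal sketch, not part of the original answers; the class and method names are made up for illustration. It runs the same regex once with the default limit and once with a negative limit, which keeps trailing empty strings.

```java
import java.util.Arrays;

public class SplitLimitDemo {
    // Regex from the answer above: a slash, the last hyphen, or trailing non-digits.
    static final String REGEX = "/|-(?=[^-]*$)|\\D+$";

    // split(regex) behaves like split(regex, 0): trailing empty strings are dropped.
    public static String[] splitDefault(String s) {
        return s.split(REGEX);
    }

    // A negative limit keeps trailing empty strings in the result.
    public static String[] splitKeepTrailing(String s) {
        return s.split(REGEX, -1);
    }

    public static void main(String[] args) {
        String x = "DATA/2019-00-01-23.x";
        System.out.println(Arrays.toString(splitDefault(x)));      // 3 items
        System.out.println(Arrays.toString(splitKeepTrailing(x))); // 4 items, the last one empty
    }
}
```

With a negative limit the match on ".x" at the end of the string leaves an extra empty item, which is why the plain split(regex) call is the convenient choice here.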
I would go with Pattern and Matcher and groups, like so: (.*?)/(.*?)-([^-]+)\\..*
Pattern pattern = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*");
Matcher matcher = pattern.matcher("DATA/2019-00-01-23.x");
if (matcher.find()) {
    System.out.println(matcher.group(1)); // DATA
    System.out.println(matcher.group(2)); // 2019-00-01
    System.out.println(matcher.group(3)); // 23
}
Or, with Java 9+, you can use Matcher.results() (this needs java.util.stream.Stream):
String[] result = Pattern.compile("(.*?)/(.*?)-([^-]+)\\..*")
        .matcher("DATA/2019-00-01-23.x")
        .results()
        .flatMap(grps -> Stream.of(grps.group(1), grps.group(2), grps.group(3)))
        .toArray(String[]::new);
Output:
[DATA, 2019-00-01, 23]
Use capturing groups to extract the three parts.
private static final Pattern PATTERN = Pattern.compile("(.+)/([-0-9]+)-([0-9]{1,2})\\..*");
public static void main(String... args) {
    Matcher matcher = PATTERN.matcher("DATA/2019-00-01-23.x");
    // groupCount() is fixed by the pattern (always 3 here), so matches() is the only check needed
    if (matcher.matches()) {
        String text = matcher.group(1);
        String date = matcher.group(2);
        String hour = matcher.group(3);
        System.out.println(text + "\t" + date + "\t" + hour);
    }
}
Dissected: (.+) / ([-0-9]+) - ([0-9]{1,2}) \\..*
(.+) - everything before the /
([-0-9]+) - digits, can contain -
- - a literal hyphen, to prevent the previous part from gobbling up the hour
([0-9]{1,2}) - one or two digits (the hour)
\\..* - a period, then 'the rest'
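The capturing-group approach can also be wrapped in a small helper. This is a hedged sketch rather than the answerer's code: the class name TokenParser, the method parse and the group names text, date and hour are assumptions chosen for illustration. It uses the same pattern shape as dissected above, with named groups so the call site does not depend on group numbers.

```java
import java.util.Arrays;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TokenParser {
    // Same structure as the pattern above; the group names are illustrative.
    private static final Pattern PATTERN =
            Pattern.compile("(?<text>.+)/(?<date>[-0-9]+)-(?<hour>[0-9]{1,2})\\..*");

    // Returns {text, date, hour}, or an empty Optional if the input does not match.
    public static Optional<String[]> parse(String input) {
        Matcher m = PATTERN.matcher(input);
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(new String[] {
            m.group("text"), m.group("date"), m.group("hour")
        });
    }

    public static void main(String[] args) {
        parse("DATA/2019-00-01-23.x")
                .ifPresent(tokens -> System.out.println(Arrays.toString(tokens)));
    }
}
```

Returning an Optional makes the non-matching case explicit, for example an input without a slash, instead of leaving the caller to handle a null or an exception.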