
Parse string using Java Regex Pattern?

I have a Java string in the following format:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";

Using the Matcher and Pattern classes from the java.util.regex package, I have to get an output string in the following format:

Output: [NYK:1100][CLT:2300][KTY:3540]

Can you suggest a RegEx pattern which can help me get the above output format?

You can use the regex \\[name:([A-Z]+)\\]\\[distance:(\\d+)\\] (written here as it appears in a Java string literal) with Pattern like this:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1 captures the name, group 2 captures the distance.
String regex = "\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]";
Pattern pattern = Pattern.compile(regex);
Matcher matcher = pattern.matcher(s);

// Append one [name:distance] entry per match found in s.
StringBuilder result = new StringBuilder();
while (matcher.find()) {
    result.append("[");
    result.append(matcher.group(1));
    result.append(":");
    result.append(matcher.group(2));
    result.append("]");
}

System.out.println(result);

Output

[NYK:1100][CLT:2300][KTY:3540]
  • regex demo
  • \\[name:([A-Z]+)\\]\\[distance:(\\d+)\\] captures two groups: the first, from \\[name:([A-Z]+)\\], holds the uppercase letters after name: and the second, from \\[distance:(\\d+)\\], holds the digits after distance:
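
To make the two groups concrete, here is a small sketch that lists what group 0 (the whole match), group 1 and group 2 hold for each match. The class name GroupInspector and the method describe are made up for this illustration; the pattern is the one from the answer above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupInspector {
    // Same pattern as in the answer: group 1 = name, group 2 = distance.
    private static final Pattern BLOCK =
            Pattern.compile("\\[name:([A-Z]+)\\]\\[distance:(\\d+)\\]");

    // Hypothetical helper: returns one line per match describing groups 0, 1 and 2.
    static List<String> describe(String input) {
        List<String> lines = new ArrayList<>();
        Matcher m = BLOCK.matcher(input);
        while (m.find()) {
            lines.add("match=" + m.group(0)
                    + " name=" + m.group(1)
                    + " distance=" + m.group(2));
        }
        return lines;
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                + "[name:KTY][distance:3540] Price:";
        describe(s).forEach(System.out::println);
    }
}
```

Because find() is called in a loop, this works for any number of [name:...][distance:...] pairs, not only three.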

Another solution, suggested by @tradeJmark, is to use named groups in the regex:

String regex = "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]";

This lets you retrieve each group by its name instead of by its index:

while (matcher.find()) {                                                
    result.append("[");
    result.append(matcher.group("name"));
    //----------------------------^^
    result.append(":");
    result.append(matcher.group("distance"));
    //------------------------------^^
    result.append("]");
}
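
The loop above relies on matcher and result being set up as in the first snippet. A self-contained sketch of the named-group variant might look like this; the class name NamedGroupFormatter and the method format are invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NamedGroupFormatter {
    // Named groups "name" and "distance" replace the numeric indexes 1 and 2.
    private static final Pattern BLOCK = Pattern.compile(
            "\\[name:(?<name>[A-Z]+)\\]\\[distance:(?<distance>\\d+)\\]");

    // Hypothetical helper: rewrites every [name:X][distance:N] pair as [X:N].
    static String format(String input) {
        Matcher matcher = BLOCK.matcher(input);
        StringBuilder result = new StringBuilder();
        while (matcher.find()) {
            result.append('[')
                  .append(matcher.group("name"))
                  .append(':')
                  .append(matcher.group("distance"))
                  .append(']');
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                + "[name:KTY][distance:3540] Price:";
        System.out.println(format(s));
    }
}
```

Named groups keep working if another capturing group is later added to the pattern, whereas numeric indexes would shift.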

If the format of the string is fixed and you always have exactly 3 [name:...][distance:...] pairs to deal with, you can define a block pattern that matches one pair and captures its 2 parts into separate groups, then use fairly simple code with .replaceAll:

String s = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] [name:KTY][distance:3540] Price:";
String matchingBlock = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";
String res = s.replaceAll(String.format(".*%1$s%1$s%1$s.*", matchingBlock), 
    "[$1:$2][$3:$4][$5:$6]");
System.out.println(res); // [NYK:1100][CLT:2300][KTY:3540]

See the Java demo and a regex demo.

The block pattern (shown as it appears in the Java string literal) matches:

  • \\s* - 0+ whitespaces
  • \\[name: - a literal [name: substring
  • ([A-Z]+) - Group n capturing 1 or more uppercase ASCII letters ( \\w+ can also be used, though it additionally accepts lowercase letters, digits and underscores)
  • ]\\[distance: - a literal ][distance: substring
  • (\\d+) - Group m capturing 1 or more digits
  • ] - a ] symbol.

In the .*%1$s%1$s%1$s.* pattern, the block is repeated three times, so the capturing groups are numbered 1 to 6 (referenced as $1 - $6 in the replacement pattern). The leading and trailing .* consume the start and end of the string so that they are dropped from the result (add (?s) at the start of the pattern if the string can contain line breaks).
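
As a rough sketch of the limits of this approach: replaceAll returns the input unchanged when the pattern finds no match, so a string with fewer than three pairs comes back as-is, and (?s) is needed for the surrounding .* to swallow lines before and after the pairs. The class FixedBlockReplace and its two methods are hypothetical names used only for this illustration.

```java
class FixedBlockReplace {
    // One [name:X][distance:N] pair, optionally preceded by whitespace.
    private static final String BLOCK = "\\s*\\[name:([A-Z]+)]\\[distance:(\\d+)]";

    // Exactly three pairs, surrounded by anything on the same line.
    private static final String THREE_BLOCKS =
            String.format(".*%1$s%1$s%1$s.*", BLOCK);

    private static final String REPLACEMENT = "[$1:$2][$3:$4][$5:$6]";

    // Single-line variant, as in the answer above.
    static String collapse(String input) {
        return input.replaceAll(THREE_BLOCKS, REPLACEMENT);
    }

    // Variant with (?s) so that . also matches line terminators.
    static String collapseMultiline(String input) {
        return input.replaceAll("(?s)" + THREE_BLOCKS, REPLACEMENT);
    }

    public static void main(String[] args) {
        String three = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] "
                + "[name:KTY][distance:3540] Price:";
        String two = "City: [name:NYK][distance:1100] [name:CLT][distance:2300] Price:";
        String multiline = "Header\n" + three + "\nFooter";

        System.out.println(collapse(three));              // three pairs: rewritten
        System.out.println(collapse(two));                // two pairs: no match, input unchanged
        System.out.println(collapseMultiline(multiline)); // header and footer lines dropped
    }
}
```

If the number of pairs can vary, the find() loop from the earlier answer is the more flexible choice.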
