My Java program, at a certain point, receives a string containing several key-value properties, like this example:
param1=value Param2=values can have spaces PARAM3=values cant have equal characters
Each parameter name (key) is a single word (a-z, A-Z, _ and 0-9) followed immediately by an = character (no spaces in between) and its value. The value is text that can contain spaces and lasts until the end of the string or the beginning of another parameter (which is again a word followed by an equals sign and its value, and so on).
I need to extract a Properties object (a string-to-string map) from this string. I was trying to use a regex to find each key-value pair. The code looks like this:
public static Properties createProperties(String str) {
    Properties prop = new Properties();
    Matcher matcher = Pattern.compile(some regex).matcher(str);
    while (matcher.find()) {
        String match = matcher.group();
        String param = ...; // What comes before '='
        String value = ...; // What comes after '='
        prop.setProperty(param, value);
    }
    return prop;
}
But the regex I wrote is not working correctly.
String regex = "(\\w+=.*)+";
Since .* tells the regex to match "anything", it matches the entire string. I want the regex to stop when it reaches the next \\w+=.* (a word followed by an equals sign and something after it).
How could I write this regex? Or what would be another solution for the problem using regex?
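The greedy behaviour described in the question can be reproduced with a small check. This is only an illustrative sketch: the class name GreedyDemo and the helper firstMatch are hypothetical names, not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Returns the first match of the question's original pattern, or null if none.
    public static String firstMatch(String input) {
        Matcher matcher = Pattern.compile("(\\w+=.*)+").matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    public static void main(String[] args) {
        String input = "param1=value Param2=values can have spaces";
        // The single match spans the whole input because .* runs to the end of the line.
        System.out.println(firstMatch(input).equals(input)); // prints "true"
    }
}
```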
You can use a Negative Lookahead here.
(\\w+)=((?:(?!\\s*\\w+=).)*)
The key is placed in capturing group #1 and the value in capturing group #2. Note that I used \\s* inside the lookahead to prevent the value from having trailing whitespace.
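As a minimal sketch of how this pattern could be plugged into the createProperties method from the question (the class name PropertyParser and the constant PAIR are assumptions made for illustration):

```java
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropertyParser {
    // Group 1 is the key. Group 2 is the value, consumed one character at a
    // time for as long as the upcoming text is not "optional whitespace,
    // a word, then '='" (that is, the start of the next pair).
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=((?:(?!\\s*\\w+=).)*)");

    public static Properties createProperties(String str) {
        Properties prop = new Properties();
        Matcher matcher = PAIR.matcher(str);
        while (matcher.find()) {
            prop.setProperty(matcher.group(1), matcher.group(2));
        }
        return prop;
    }

    public static void main(String[] args) {
        Properties p = createProperties(
                "param1=value Param2=values can have spaces PARAM3=values cant have equal characters");
        System.out.println(p.getProperty("Param2")); // prints "values can have spaces"
    }
}
```

Because the lookahead includes \\s*, the whitespace separating one pair from the next is left out of the value, so no trim() call is needed here.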
One way among several:
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// subjectString is the input string to parse
List<String> paramNames = new ArrayList<>();
List<String> paramValues = new ArrayList<>();
Pattern regex = Pattern.compile("([^\\s=]+)=([^\\s=]+)");
Matcher regexMatcher = regex.matcher(subjectString);
while (regexMatcher.find()) {
    paramNames.add(regexMatcher.group(1));
    paramValues.add(regexMatcher.group(2));
}
The regex:
([^\\s=]+)=([^\\s=]+)
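The code retrieves keys as Group 1, values as Group 2. Note that because the value part excludes whitespace, a value containing spaces is captured only up to its first space.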
Explanation
([^\\s=]+) captures any characters that are not whitespace or an equals sign into Group 1
= matches the literal =
([^\\s=]+) captures any characters that are not whitespace or an equals sign into Group 2
Your regex would be:
(\\w+=(?:(?!\\w+=).)*)
It captures each param=value pair up to the next param=. For the example string it produces three separate matches, one per param=value pair, each available in group 1.
Explanation:
\\w+= matches one or more word characters followed by an = symbol.
(?:(?!\\w+=).)* uses a non-capturing group with a negative lookahead to match any character that does not start a \\w+= sequence, so the match extends up to the next param=.
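Since this pattern returns the whole param=value pair in one group, the pair still has to be split at the first = sign. The match also includes the whitespace that precedes the next key, so the value is trimmed. The following is a sketch under those assumptions; the class name PairSplitter and the method parse are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PairSplitter {
    // Group 1 holds the whole "key=value" pair, up to the next "word=".
    private static final Pattern PAIR =
            Pattern.compile("(\\w+=(?:(?!\\w+=).)*)");

    public static Map<String, String> parse(String str) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(str);
        while (matcher.find()) {
            String pair = matcher.group(1);
            int eq = pair.indexOf('='); // first '=' separates key from value
            String key = pair.substring(0, eq);
            String value = pair.substring(eq + 1).trim(); // drop separator whitespace
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("param1=value Param2=values can have spaces"));
    }
}
```

A LinkedHashMap is used here only to keep the pairs in input order; a Properties object would work the same way with setProperty.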