I am trying to find key-value pairs in a string using regex (not sure if that is wise!). Here is my string:
key1=key1 value key2=key2 value_key3=something key3=key3_value
key1, key2, and key3 are keys. As you can see, the values can contain spaces, and there is more: the value of key2 has key3 in it (key2 value_key3=something)! Sorry, this is how my input is.
That is not all. The keys can appear in any order, as below:
key3=key3_value key1=key1 value key2=key2 value_key3=something
key2=key2 value_key3=something key1=key1 value key3=key3_value
Now I want a regex that finds the right groups for keys and values, so I can later build key-value pairs like:
key1=key1 value
key2=key2 value_key3=something
key3=key3_value
I tried the regex key1=(.*)key2=(.*)key3=(.*), but that works only for the first string. If I change the order of the keys, as in the second and third strings, it no longer matches.
Do each key separately:
String key1value = input.replaceAll(".*\\bkey1= *(\\S+).*", "$1");
// similar for other keys
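This extracts the run of non-space characters after "key1=", so a value that contains a space (such as "key1 value") is cut off at the first space. The key3 inside the value of key2 is not picked up, because the word boundary \\b is required before the start of the key and there is no word boundary between "_" and "key3".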
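A minimal sketch of the same per-key idea, using Matcher.find instead of replaceAll so that a missing key yields null rather than the unchanged input. The class name PerKeyLookup and the helper firstToken are hypothetical, and wrapping the key in Pattern.quote is an added assumption in case a key contains regex metacharacters:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PerKeyLookup {
    // Hypothetical helper: returns the first run of non-space characters
    // after "key=", or null when the key does not occur at a word boundary.
    static String firstToken(String input, String key) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(key) + "= *(\\S+)");
        Matcher m = p.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "key1=key1 value key2=key2 value_key3=something key3=key3_value";
        // The capture stops at the first space, so key1 gives "key1", not "key1 value".
        System.out.println(firstToken(input, "key1"));
        // "value_key3=" is skipped: "_" is a word character, so \b does not match before "key3" there.
        System.out.println(firstToken(input, "key3"));
    }
}
```

As the first call shows, this only fits inputs whose values contain no spaces.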
This might get you started:
(\w+)=((?:(?!\bkey\w+=).)+)
See a demo on regex101.com.
In my opinion, the distinction between key2=key2 value_key3=something and key2=key2 value_key3=something will be the most difficult.
For a better answer, please provide some real input strings.
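To try this pattern from Java, a sketch along the following lines may help. It assumes, as the pattern itself does, that every real key starts with "key"; the class name TemperedDotDemo and the method parse are hypothetical, and trimming the captured value is an added choice, since the match includes the space before the next key:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedDotDemo {
    // The pattern from the answer: a key, "=", then every character that
    // does not start a new "key...=" at a word boundary.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=((?:(?!\\bkey\\w+=).)+)");

    // Hypothetical helper: collects all pairs in the order they appear.
    static Map<String, String> parse(String input) {
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(input);
        while (m.find()) {
            // The raw capture keeps the separating space, so trim it.
            pairs.put(m.group(1), m.group(2).trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "key3=key3_value key1=key1 value key2=key2 value_key3=something"));
    }
}
```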
Maybe this will help you:
\b([a-z\d]+)=(.*?)(?=\b[a-z\d]+=|$)
It depends on keys consisting of alphanumeric characters only, though. If keys can contain underscores, as the value does in your example, it fails. :( And if keys can contain capital letters, the ignore-case flag must be set.
What it does is capture a key (letters and digits allowed), match a =, and then capture everything up to the next key or the end of the line.
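A possible way to apply this pattern in Java, including the ignore-case flag mentioned above. This is only a sketch: the class name LazyLookaheadDemo and the method parse are hypothetical, and the trim() call is an added choice because the lazy capture includes the space before the next key:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LazyLookaheadDemo {
    private static final String REGEX = "\\b([a-z\\d]+)=(.*?)(?=\\b[a-z\\d]+=|$)";

    // Hypothetical helper: ignoreCase switches on Pattern.CASE_INSENSITIVE
    // so that keys such as "Key1" or "KEY2" are recognised as well.
    static Map<String, String> parse(String input, boolean ignoreCase) {
        Pattern p = Pattern.compile(REGEX, ignoreCase ? Pattern.CASE_INSENSITIVE : 0);
        Map<String, String> pairs = new LinkedHashMap<>();
        Matcher m = p.matcher(input);
        while (m.find()) {
            pairs.put(m.group(1), m.group(2).trim());
        }
        return pairs;
    }

    public static void main(String[] args) {
        System.out.println(parse(
                "key2=key2 value_key3=something key1=key1 value key3=key3_value", false));
        System.out.println(parse("Key1=a b KEY2=c", true));
    }
}
```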
After some serious thought: this is indeed solvable, though a little tricky.
The main problem I was facing was the order of the keys; otherwise the regex key1=(.*)key2=(.*)key3=(.*) would have been sufficient.
So I first work out the order of the keys by splitting the input into tokens and using Java's indexOf to find the = in each token.
Then I construct the regex at runtime using that order; code below:
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PropParser
{
    public static void main( String[] args )
    {
        List<String> myPropKeys = new ArrayList<String>();
        myPropKeys.add( "key1" );
        myPropKeys.add( "key2" );
        myPropKeys.add( "key3" );
        String input1 = "key1=key1 value key2=key2 value_key3=something key3=key3_value";
        String input2 = "key3=key3_value key1=key1 value key2=key2 value_key3=something";
        String input3 = "key2=key2 value_key3=something key1=key1 value key3=key3_value";
        // Print each result so the parsed pairs are visible.
        System.out.println( getPropValues( input1, myPropKeys ) );
        System.out.println( getPropValues( input2, myPropKeys ) );
        System.out.println( getPropValues( input3, myPropKeys ) );
    }

    private static Map<String, String> getPropValues( String input, List<String> myPropKeys )
    {
        Map<String, String> propValues = new HashMap<String, String>();

        // Step 1: find the order in which the known keys appear. A key only
        // counts when it starts a whitespace-delimited token ("key2=...").
        StringTokenizer tokens = new StringTokenizer( input );
        List<String> propKeyList = new ArrayList<String>();
        while( tokens.hasMoreTokens() )
        {
            String token = tokens.nextToken();
            int equalsIndex = token.indexOf( "=" );
            if( equalsIndex != -1 )
            {
                String propertyToken = token.substring( 0, equalsIndex );
                if( myPropKeys.contains( propertyToken ) )
                {
                    propKeyList.add( propertyToken );
                }
            }
        }

        // Step 2: build e.g. key3=(.*)key1=(.*)key2=(.*) in that order.
        // Pattern.quote guards against regex metacharacters in a key.
        StringBuilder sb = new StringBuilder();
        for( String propKey : propKeyList )
        {
            sb.append( Pattern.quote( propKey ) ).append( "=" );
            sb.append( "(.*)" );
        }

        // Step 3: match once and read one group per key.
        Pattern p = Pattern.compile( sb.toString() );
        Matcher m = p.matcher( input );
        List<String> values = new ArrayList<String>();
        if( m.find() )
        {
            for( int i = 1; i <= propKeyList.size(); i++ )
            {
                values.add( m.group( i ) );
            }
        }

        // Step 4: pair keys with values only if every key got a value.
        if( propKeyList.size() == values.size() )
        {
            for( int i = 0; i < propKeyList.size(); i++ )
            {
                propValues.put( propKeyList.get( i ), values.get( i ).trim() );
            }
        }
        return propValues;
    }
}
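For comparison, when the set of keys is known up front, the ordering step can probably be avoided by putting the keys into a single alternation. The following is an alternative sketch, not the code above: the class name KnownKeysParser and the method parse are hypothetical, and it assumes that a new pair always begins with whitespace followed by one of the known keys:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

class KnownKeysParser {
    // Hypothetical helper: matches "<known key>=<value>" where the value runs
    // lazily up to whitespace followed by another known key, or to the end.
    static Map<String, String> parse(String input, List<String> keys) {
        String alternation = keys.stream()
                .map(Pattern::quote)
                .collect(Collectors.joining("|"));
        Pattern pair = Pattern.compile(
                "\\b(" + alternation + ")=(.*?)(?=\\s+(?:" + alternation + ")=|$)");
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = pair.matcher(input);
        while (m.find()) {
            result.put(m.group(1), m.group(2).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("key1", "key2", "key3");
        System.out.println(parse(
                "key2=key2 value_key3=something key1=key1 value key3=key3_value", keys));
    }
}
```

Because the lookahead requires whitespace before the next key, the key3 inside value_key3=something stays part of the value of key2.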