Parse a String of key=value to a Map
I'm using an API that gives me XML, and I need to build a map from one tag whose content is actually a string. Example:
Having
Billable=7200,Overtime=false,TransportCosts=20$
I need
["Billable"="7200","Overtime"="false","TransportCosts"="20$"]
The problem is that the string is totally dynamic, so it can look like
Overtime=true,TransportCosts=one, two, three
Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
So I cannot just split by comma and then by equals sign. Is it possible to convert a string like those to a map using a regex?
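To make the failure concrete, here is a minimal sketch of the plain comma-then-equals split applied to the second sample. The class and method names (NaiveSplitDemo, naiveParse) are made up for illustration and are not part of the API in question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical demo class: shows what a plain comma/equals split does to the sample.
class NaiveSplitDemo {

    static Map<String, String> naiveParse(String input) {
        // LinkedHashMap keeps insertion order so the printed result is predictable.
        Map<String, String> result = new LinkedHashMap<>();
        for (String piece : input.split(",")) {
            String[] keyValue = piece.split("=");
            if (keyValue.length >= 2) {
                // Anything after a second '=' inside the piece is dropped here.
                result.put(keyValue[0], keyValue[1]);
            }
            // Pieces without '=' (such as "two" or " three") are silently lost.
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        // Prints {Overtime=true, TransportCosts=1, Billable=7200}
        System.out.println(naiveParse(s));
    }
}
```

The value of TransportCosts comes back as "1" instead of "1= 1,two, three", which is exactly the problem described above.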
My code so far is:
private Map<String, String> getAttributes(String attributes) {
    final Map<String, String> attr = new HashMap<>();
    if (attributes.contains(",")) {
        final String[] pairs = attributes.split(",");
        for (String s : pairs) {
            if (s.contains("=")) {
                final String pair = s;
                final String[] keyValue = pair.split("=");
                attr.put(keyValue[0], keyValue[1]);
            }
        }
        return attr;
    }
    return attr;
}
Thank you in advance

You may use
(\w+)=(.*?)(?=,\w+=|$)
See the regex demo.
Details
- (\w+) - Group 1: one or more word chars
- = - an equals sign
- (.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
- (?=,\w+=|$) - a positive lookahead that requires a comma, then one or more word chars, and then =, or the end of the string, immediately to the right of the current location.

Java code:
public static Map<String, String> getAttributes(String attributes) {
    Map<String, String> attr = new HashMap<>();
    Matcher m = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)").matcher(attributes);
    while (m.find()) {
        attr.put(m.group(1), m.group(2));
    }
    return attr;
}
String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
Map<String, String> map = getAttributes(s);
for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.println(entry.getKey() + "=" + entry.getValue());
}
Result:
Overtime=true
Billable=7200
TransportCosts=1= 1,two, three
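Note that the result is not printed in input order, because HashMap makes no ordering guarantee. If the order of the pairs matters, a possible variation is sketched below. It uses the same regex, compiles it once, and collects into a LinkedHashMap. The class name OrderedAttributes and the null check are my own additions for illustration, not something the question requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: same regex as above, with insertion order preserved.
class OrderedAttributes {

    // Compiled once and reused across calls.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)");

    static Map<String, String> parse(String attributes) {
        Map<String, String> attr = new LinkedHashMap<>();
        if (attributes == null) {
            return attr; // assumption: treat a missing tag as "no attributes"
        }
        Matcher m = PAIR.matcher(attributes);
        while (m.find()) {
            attr.put(m.group(1), m.group(2));
        }
        return attr;
    }

    public static void main(String[] args) {
        String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        parse(s).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With this variation the keys come back as Overtime, TransportCosts, Billable, in the order they appear in the input.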

The first thing I noticed is that a delimiter is not easily identifiable in the data you've given, but what does appear to be identifiable is that a comma followed by a capital letter separates each field.
This allows an approach that changes the delimiter to something easily identifiable, using String.replaceAll("(?<=,)([A-Z])", ",$1"). Now you'll have a delimiter that you can identify (,,), and you can split the data on it to insert the quotes where needed.
Something like:
public class StackOverflow {
    public static void main(String[] args) {
        String[] data = {
            "Overtime=true,TransportCosts=one, two, three",
            "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"
        };
        for (int i = 0; i < data.length; i++) {
            data[i] = data[i].replaceAll("(?<=,)([A-Z])", ",$1");
            String[] pieces = data[i].split(",,");
            for (int j = 0; j < pieces.length; j++) {
                int equalIndex = pieces[j].indexOf("=");
                StringBuilder sb = new StringBuilder(pieces[j]);
                // Insert quotes around the = sign
                sb.insert(equalIndex, "\"");
                sb.insert(equalIndex + 2, "\"");
                // Insert quotes at the beginning and end of the string
                sb.insert(0, "\"");
                sb.append("\"");
                pieces[j] = sb.toString();
            }
            // Join the pieces back together delimited by a comma
            data[i] = String.join(",", pieces);
            System.out.println(data[i]);
        }
    }
}
Results
"Overtime"="true","TransportCosts"="one, two, three"
"Overtime"="true","TransportCosts"="1= 1,two, three","Billable"="7200"

Alternative, IMHO simpler regex: ([^,]+=[^=]+)(,|$)
([^,]+=[^=]+) → Groups of: anything but a comma, followed by 1 equals sign, followed by anything but an equals sign...
(,|$) → ... separated by either a comma or end-of-line
Tests:
public static void main(String[] args) {
    Pattern pattern = Pattern.compile("([^,]+=[^=]+)(,|$)");
    String test1 = "abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982";
    System.out.println("Test 1: " + test1);
    Matcher matcher = pattern.matcher(test1);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
    System.out.println();
    String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
    System.out.println("Test 2: " + test2);
    matcher = pattern.matcher(test2);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}
Output:
Test 1: abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982
abc=def,jkl
nm=ghi
egrh=jh=22,kdfka,92
kjasd=908@0982
Test 2: Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
Overtime=true
TransportCosts=1= 1,two, three
Billable=7200
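The tests above print each matched pair as one string. To get the Map the question asks for, one option is to cut group 1 at its first equals sign. The sketch below assumes that the key never contains an equals sign. The class name FirstEqualsSplit is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: turns the matched pairs into map entries.
class FirstEqualsSplit {

    private static final Pattern PAIR = Pattern.compile("([^,]+=[^=]+)(,|$)");

    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            String pair = matcher.group(1);
            // The pattern guarantees at least one '=' in group 1.
            int eq = pair.indexOf('=');
            // Key is everything before the first '=', value is everything after it.
            result.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        parse(test2).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

For Test 1 this maps egrh to jh=22,kdfka,92, so any extra equals signs and commas stay in the value.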

I saw this code using Guava
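import com.google.common.base.Splitter;
import java.util.Collections;
import java.util.Map;
// StringUtils is not imported in the original snippet; Apache Commons Lang is one common source (assumption).

/**
 * Parses a string such as 'prop1=val1; prop2=val2' into a map.
 */
public static Map<String, String> parseMap(final String keyValueString) {
    if (StringUtils.isEmpty(keyValueString)) return Collections.emptyMap();
    return Splitter.on(";")
            .trimResults()
            .withKeyValueSeparator('=')
            .split(keyValueString);
}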
One note: IDEA shows a warning because Splitter is annotated with com.google.common.annotations.Beta. That is not bad, but it may require some work when the Guava library version is updated.
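If adding Guava is not an option, the same 'prop1=val1; prop2=val2' format can be handled with the JDK alone. The following is only a rough sketch and not a drop-in replacement for Splitter: the class name PlainKeyValueParser is hypothetical, and splitting each entry on its first equals sign only is a choice of mine.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical JDK-only helper for strings like 'prop1=val1; prop2=val2'.
class PlainKeyValueParser {

    static Map<String, String> parseMap(String keyValueString) {
        if (keyValueString == null || keyValueString.isEmpty()) {
            return Collections.emptyMap();
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (String entry : keyValueString.split(";")) {
            // Trim the entry, then split on the first '=' only (limit 2),
            // so any further '=' characters stay in the value.
            String[] keyValue = entry.trim().split("=", 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Not a key=value entry: " + entry);
            }
            result.put(keyValue[0], keyValue[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {prop1=val1, prop2=val2}
        System.out.println(parseMap("prop1=val1; prop2=val2"));
    }
}
```

Like the Guava snippet, this expects a semicolon between pairs, so it does not by itself solve the comma-separated input from the question.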