
Parse a String of key=value to a Map

I'm using an API that gives me an XML document, and I need to get a map from one tag whose content is actually a string. Example:

Having

Billable=7200,Overtime=false,TransportCosts=20$

I need

["Billable"="7200","Overtime"="false","TransportCosts"="20$"]

The problem is that the string is totally dynamic, so it can look like

Overtime=true,TransportCosts=one, two, three
Overtime=true,TransportCosts=1= 1,two, three,Billable=7200

So I cannot just split by comma and then by equals sign. Is it possible to convert strings like those to a map using a regex?

My code so far is:

private Map<String, String> getAttributes(String attributes) {
    final Map<String, String> attr = new HashMap<>();
    if (attributes.contains(",")) {
        final String[] pairs = attributes.split(",");
        for (String s : pairs) {
            if (s.contains("=")) {
                final String pair = s;
                final String[] keyValue = pair.split("=");
                attr.put(keyValue[0], keyValue[1]);
            }
        }
        return attr;
    }
    return attr;
}

Thank you in advance

You may use

(\w+)=(.*?)(?=,\w+=|$)

See the regex demo.

Details

  • (\w+) - Group 1: one or more word chars
  • = - an equals sign
  • (.*?) - Group 2: any zero or more chars other than line break chars, as few as possible
  • (?=,\w+=|$) - a positive lookahead that requires a comma, then 1+ word chars, and then =, or the end of the string, immediately to the right of the current location.

Java code:

import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static Map<String, String> getAttributes(String attributes) {
    Map<String, String> attr = new HashMap<>();
    Matcher m = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)").matcher(attributes);
    while (m.find()) {
        attr.put(m.group(1), m.group(2));
    }
    return attr;
}

Java test:

String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
Map<String,String> map = getAttributes(s);
for (Map.Entry<String, String> entry : map.entrySet()) {
    System.out.println(entry.getKey() + "=" + entry.getValue());
}

Result:

Overtime=true
Billable=7200
TransportCosts=1= 1,two, three
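
The result above is not in input order because HashMap makes no ordering guarantee. If the order of the pairs matters, one possible variation is to swap in a LinkedHashMap. The sketch below reuses the same regex; the class name OrderedAttributes and the constant PAIR are hypothetical names chosen for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
public class OrderedAttributes {
    // Same pattern as in the answer, compiled once and reused across calls.
    private static final Pattern PAIR = Pattern.compile("(\\w+)=(.*?)(?=,\\w+=|$)");

    public static Map<String, String> getAttributes(String attributes) {
        // LinkedHashMap iterates in insertion order, unlike HashMap.
        Map<String, String> attr = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(attributes);
        while (m.find()) {
            attr.put(m.group(1), m.group(2));
        }
        return attr;
    }

    public static void main(String[] args) {
        String s = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        getAttributes(s).forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With this change the entries should come back in the order they appear in the input string.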

The first thing I noticed is that a delimiter is not easily identifiable in the data you're giving, but what does appear to be identifiable is that a comma followed by a capital letter separates each field.

This allows for an approach that changes the delimiter to something easily identifiable with a regex, using String.replaceAll("(?<=,)([A-Z])", ",$1"). Now you'll have a delimiter that you can identify (,,), and you can split the data on it to insert the quotes where needed.

Something like:

public class StackOverflow {
    public static void main(String[] args) {
        String [] data = {
                "Overtime=true,TransportCosts=one, two, three",
                "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"
        };

        for (int i = 0; i < data.length; i++) {
            data[i] = data[i].replaceAll("(?<=,)([A-Z])", ",$1");
            String[] pieces = data[i].split(",,");
            for (int j = 0; j < pieces.length; j++) {
                int equalIndex = pieces[j].indexOf("=");
                StringBuilder sb = new StringBuilder(pieces[j]);
                // Insert quotes around the = sign
                sb.insert(equalIndex, "\"");
                sb.insert(equalIndex + 2, "\"");
                // Insert quotes at the beginning and end of the string
                sb.insert(0, "\"");
                sb.append("\"");
                pieces[j] = sb.toString();              
            }

            // Join the pieces back together delimited by a comma
            data[i] = String.join(",", pieces);
            System.out.println(data[i]);
        }
    }
}

Results

"Overtime"="true","TransportCosts"="one, two, three"
"Overtime"="true","TransportCosts"="1= 1,two, three","Billable"="7200"

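If the goal is a Map rather than a quoted string, the same observation can be applied more directly. The sketch below is an assumption-laden variation, not the code above: it splits on a comma that is immediately followed by a capital letter (using a lookahead so the letter is kept), then cuts each piece at its first equals sign. The class name CapitalSplit is hypothetical, and the approach would break if a value ever contained a comma directly followed by a capital letter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name, used only for this illustration.
public class CapitalSplit {
    // Assumption carried over from the answer: every key starts with a
    // capital letter and directly follows a comma.
    public static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        // The lookahead leaves the capital letter in the next piece.
        for (String piece : input.split(",(?=[A-Z])")) {
            int eq = piece.indexOf('=');
            if (eq < 0) {
                continue; // no key=value structure; skip this piece
            }
            // Cut at the first '=' only, so later '=' signs stay in the value.
            result.put(piece.substring(0, eq), piece.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("Overtime=true,TransportCosts=one, two, three"));
        System.out.println(parse("Overtime=true,TransportCosts=1= 1,two, three,Billable=7200"));
    }
}
```
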
Alternative, IMHO simpler regex: ([^,]+=[^=]+)(,|$)

([^,]+=[^=]+) → Groups of: anything but a comma, followed by 1 equals sign, followed by anything but an equals sign...
(,|$) → ... separated by either a comma or end-of-line

Tests:

public static void main(String[] args) {
    Pattern pattern = Pattern.compile("([^,]+=[^=]+)(,|$)");

    String test1 = "abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982";
    System.out.println("Test 1: "+test1);
    Matcher matcher = pattern.matcher(test1);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
    System.out.println();
    String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
    System.out.println("Test 2: "+test2);
    matcher = pattern.matcher(test2);
    while (matcher.find()) {
        System.out.println(matcher.group(1));
    }
}

Output:

Test 1: abc=def,jkl,nm=ghi,egrh=jh=22,kdfka,92,kjasd=908@0982
abc=def,jkl
nm=ghi
egrh=jh=22,kdfka,92
kjasd=908@0982

Test 2: Overtime=true,TransportCosts=1= 1,two, three,Billable=7200
Overtime=true
TransportCosts=1= 1,two, three
Billable=7200
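
The tests above print each pair as a single string. To get the Map the question asks for, each captured pair still has to be cut into key and value. A minimal sketch, assuming the key is everything before the first equals sign in the pair (the class name PairsToMap is hypothetical):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, used only for this illustration.
public class PairsToMap {
    // Same pattern as in the tests above.
    private static final Pattern PAIR = Pattern.compile("([^,]+=[^=]+)(,|$)");

    public static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher matcher = PAIR.matcher(input);
        while (matcher.find()) {
            String pair = matcher.group(1);
            // Group 1 always contains at least one '='; cut at the first one
            // so any later '=' signs remain part of the value.
            int eq = pair.indexOf('=');
            result.put(pair.substring(0, eq), pair.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        String test2 = "Overtime=true,TransportCosts=1= 1,two, three,Billable=7200";
        parse(test2).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```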

I saw this code using Guava

import com.google.common.base.Splitter;

/**
 * parse string 'prop1=val1; prop2=val2' to map
 */
public static Map<String, String> parseMap(final String keyValueString) {
    if (StringUtils.isEmpty(keyValueString)) return Collections.emptyMap();

    return Splitter.on(";")
            .trimResults()
            .withKeyValueSeparator('=')
            .split(keyValueString);
}

One note: IDEA shows a warning because Splitter is annotated with com.google.common.annotations.Beta. That is not a problem in itself, but it may require some rework when the Guava library version is updated.
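
For readers who would rather avoid the dependency, a rough plain-JDK sketch of the same idea follows. It is not Guava's implementation: the class name SemicolonPairs is hypothetical, and as assumptions of this sketch it cuts each entry at the first equals sign, silently skips entries without one, and lets a repeated key overwrite the earlier value, all of which may differ from Guava's behavior.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical class name; a plain-JDK approximation, not Guava's code.
public class SemicolonPairs {
    public static Map<String, String> parseMap(String keyValueString) {
        Map<String, String> result = new LinkedHashMap<>();
        if (keyValueString == null || keyValueString.isEmpty()) {
            return result;
        }
        for (String entry : keyValueString.split(";")) {
            // Trim whitespace around each entry, e.g. " prop2=val2".
            String trimmed = entry.trim();
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                continue; // assumption: ignore entries without '='
            }
            result.put(trimmed.substring(0, eq), trimmed.substring(eq + 1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parseMap("prop1=val1; prop2=val2"));
    }
}
```

Like the Guava version, this only suits input where the separator never appears inside a value, so it would not handle the comma-laden strings from the question.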
