
Parsing repeating groups from a string

I'm currently taking in a string that is divided up as follows:

20004=1~^20005=0~^773=~^665=~^453=3~^448=0A~!447=D~!452=1~!~^448=0A~!447=D~!452=17~!~^448=81~!447=D~!452=7~!~^11=1116744Pq2Q~^70=15040024-1~^793=MNL-?--1~^467=37878024-1~^60=20110617-05:57:31~^75=20110616~^768=1~^769=20110616-19:17:00~!770=1~!~^55=7800950~^48=AEP~^22=~^454=0~^460=5~^167=TCKR~^

The string is structured so that ~^ separates attributes and ~! marks groups. Groups are preceded by an attribute that gives the number of repeating groups, for example:

453=3~^448=0A~!447=D~!452=1~!~^448=0A~!447=D~!452=17~!~^448=81~!447=D~!452=7~!~^

Here tag 453 denotes that there are 3 groups.

I was using a parser such as this:

public Map<Integer, Object> parse(Object target)
{
    String[] elements = ((String) target).split(elementDilimiter);

    Map<Integer, Object> targetFields = new LinkedHashMap<Integer, Object>();

    for(int i=0; i<elements.length; i++)
    {
        String[] attributes = elements[i].split(attributeDelimiter);

        if(attributes.length != 2 || attributes[0].length() == 0 || attributes[1].length() == 0)
        {
            /*throw new ParsingException("Malformed element: " + element + ", expected: tag=value");*/
            continue;
        }
        targetFields.put(Integer.valueOf(attributes[0]), attributes[1]);
    }
    return targetFields;
}

Element delimiter = ~^ and attribute delimiter = "="

So after the line:

String[] elements = ((String) target).split(elementDilimiter);

the values are split as follows:

453=3, 448=0A~!447=D~!452=1~!, 448=0A~!447=D~!452=17~!, 448=81~!447=D~!452=7~!,

These are then split on the equals sign and placed in a map, using the tag number as the key to return the relevant object.

However, when the groups get to:

String[] attributes = element.split(attributeDelimiter);

the groups go no further, because of the check:

attributes.length != 2

Ideally, though, I would like my implementation to grab tag 453, realise there are 3 repeating groups, and pass those groups to a parser that splits them on ~! and places them in a sub-map.

To be honest, when I think about the implementation my head starts to spin.

Is there a simple, elegant solution for this, or do I basically have to start from scratch?

EDIT

Is 453 defined to always be the identifier for the number of groups? Yes, the tag before the groups merely tells me how many groups there will be. I have no control over the incoming string or its format; it will take the form shown above.

What do your groups represent? I ask because I would have thought attributes belong together as part of groups, but in your method you split on attributes but not groups, and you split on elements before attributes. I split the elements so that they can be separated as 453=3 etc.; however, this group business then came in, so now I must rewrite to accommodate groups too. This is in essence my problem: beforehand I had lovely tag values that mapped to an object and could be accessed simply via:

targetFields.get(TagNumber);

Now I will need to rewrite it so that the groups are accessible too!

I hope this clears things up a bit.

Use String.split in two steps. First split out the groups. Then split the attributes within each group.

That will solve your problem.
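A minimal sketch of the two-step split, assuming the delimiters shown in the question (~^ between elements, ~! between attributes inside a group). The class name TwoStepSplit and the choice to return one map per element are illustrative assumptions, not something from the question. Note that String.split takes a regular expression, so the delimiters are wrapped in Pattern.quote here; unlike the parser in the question, this sketch keeps tags with empty values such as 773=.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answer.
class TwoStepSplit {
    // Pattern.quote makes the delimiters literal; a bare "~^" would treat ^ as a regex anchor.
    private static final String ELEMENT_DELIMITER = Pattern.quote("~^");
    private static final String GROUP_DELIMITER = Pattern.quote("~!");

    // Returns one map per ~^-separated element.
    // A plain tag=value element yields a single-entry map; a group yields one entry per attribute.
    static List<Map<Integer, String>> split(String message) {
        List<Map<Integer, String>> elements = new ArrayList<>();
        // Step 1: split the message into elements.
        for (String element : message.split(ELEMENT_DELIMITER)) {
            Map<Integer, String> fields = new LinkedHashMap<>();
            // Step 2: split each element into its attributes.
            for (String attribute : element.split(GROUP_DELIMITER)) {
                // Limit 2 keeps an empty value such as "773=" as ["773", ""].
                String[] pair = attribute.split("=", 2);
                if (pair.length == 2 && !pair[0].isEmpty()) {
                    fields.put(Integer.valueOf(pair[0]), pair[1]);
                }
            }
            if (!fields.isEmpty()) {
                elements.add(fields);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        System.out.println(split("453=1~^448=0A~!447=D~!452=1~!~^55=7800950~^"));
    }
}
```

This only flattens the message into a list of small maps; it does not yet tie each group back to the count tag that precedes it.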

This code will parse out the groups/subgroups. You can replace the System.out.println statements with your map building. You may want to rethink the format, however, because it could be a lot clearer if you used a format that naturally supports nesting, like XML.

@Test
public void testname() throws Exception {
    parseText("453=3~^448=0A~!447=D~!452=1~!~^448=0A~!447=D~!452=17~!~^448=81~!447=D~!452=7~!~^");
}

// Number of repeating groups still expected after the count tag (453).
private int subgroupLength = 0;

public void parseText(String text) {
    // "~\\^" escapes ^ so the regex matches the literal ~^ delimiter.
    for (String group : text.split("~\\^")) {
        System.out.println("Group");
        parseGroup(group);
    }
}

public void parseGroup(String group) {
    // True when this element is one of the groups announced by the count tag.
    boolean inSubgroup = subgroupLength > 0;
    for (String attribute : group.split("~!"))
        parseAttribute(attribute, inSubgroup);
    // Count down once per group, not once per attribute.
    if (inSubgroup)
        subgroupLength--;
}

public void parseAttribute(String attribute, boolean inSubgroup) {
    String[] split = attribute.split("=");
    if (split.length != 2)
        return;

    if (split[0].equals("453")) {
        System.out.println("\tSubgroup length " + split[1]);
        subgroupLength = Integer.parseInt(split[1]);
    } else if (inSubgroup) {
        System.out.println("\t\t" + split[0] + " = " + split[1]);
    } else
        System.out.println("\t" + split[0] + " = " + split[1]);
}
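For the map building itself, one possible shape is sketched below. It rests on an assumption that goes beyond what the question states: any element containing ~! is treated as a repeating group that belongs to the most recent plain tag (453 in the sample, but also 768), and that tag's count string is replaced in the map by a List of sub-maps. The class name RepeatingGroupParser is hypothetical, the declared count is not checked against the number of groups actually seen, and empty values are kept rather than skipped.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical class; the grouping rule below is an assumption, not a documented format.
class RepeatingGroupParser {
    private static final String ELEMENT_DELIMITER = Pattern.quote("~^");
    private static final String GROUP_DELIMITER = Pattern.quote("~!");

    // Plain tags map to String values; a count tag followed by groups maps to a
    // List of sub-maps, one per group.
    @SuppressWarnings("unchecked")
    static Map<Integer, Object> parse(String message) {
        Map<Integer, Object> fields = new LinkedHashMap<>();
        Integer lastTag = null; // most recent plain tag, assumed to own any group that follows
        for (String element : message.split(ELEMENT_DELIMITER)) {
            if (element.contains("~!")) {
                if (lastTag == null) {
                    continue; // group with no preceding tag: skipped in this sketch
                }
                Object current = fields.get(lastTag);
                if (!(current instanceof List)) {
                    // First group for this tag: replace the count string with a list.
                    current = new ArrayList<Map<Integer, String>>();
                    fields.put(lastTag, current);
                }
                ((List<Map<Integer, String>>) current).add(parseGroup(element));
            } else {
                String[] pair = element.split("=", 2);
                if (pair.length != 2 || pair[0].isEmpty()) {
                    continue; // malformed element
                }
                lastTag = Integer.valueOf(pair[0]);
                fields.put(lastTag, pair[1]);
            }
        }
        return fields;
    }

    // Splits one group such as "448=0A~!447=D~!452=1~!" into its own sub-map.
    static Map<Integer, String> parseGroup(String group) {
        Map<Integer, String> sub = new LinkedHashMap<>();
        for (String attribute : group.split(GROUP_DELIMITER)) {
            String[] pair = attribute.split("=", 2);
            if (pair.length == 2 && !pair[0].isEmpty()) {
                sub.put(Integer.valueOf(pair[0]), pair[1]);
            }
        }
        return sub;
    }

    public static void main(String[] args) {
        System.out.println(parse("453=2~^448=0A~!447=D~!452=1~!~^448=81~!447=D~!452=7~!~^55=7800950~^"));
    }
}
```

With this layout, plain tags are still read with targetFields.get(TagNumber) as before, while a count tag returns a List whose entries can be queried the same way. A count tag with no groups after it (for example a value of 0) simply keeps its string value.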
