
How to remove/strip off unique text from a Map of Strings in Java

I've got a Map<String, List<String>> data structure in which the keys have the following string text:

/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[69]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[70]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[71]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[72]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[73]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[74]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[75]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[76]/PeriodBeginDate[1]

The index numbers represent the current occurrence of that particular element node. However, I would like to remove the indexes for the elements that occur only once.

Example:

From:

/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[69]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[70]/PeriodBeginDate[1]

To:

/PayrollFormInfo/CompanyInfo/FederalTotals[69]/PeriodBeginDate
/PayrollFormInfo/CompanyInfo/FederalTotals[70]/PeriodBeginDate

How can I do this in Java?

Edit:

@Andreas brings up a good point. My current implementation for getting an indexed XPath is here:

public String getFullXPath(Node n) {
    if (null == n)
        return null;

    Node parent;
    Stack<Node> hierarchy = new Stack<Node>();
    StringBuilder builder = new StringBuilder();

    hierarchy.push(n);

    switch (n.getNodeType()) {
        case Node.ATTRIBUTE_NODE:
            parent = ((Attr) n).getOwnerElement();
            break;
        case Node.ELEMENT_NODE:
            parent = n.getParentNode();
            break;
        case Node.DOCUMENT_NODE:
            parent = n.getParentNode();
            break;
        default:
            throw new IllegalStateException("Unexpected Node type" + n.getNodeType());
    }

    while (null != parent
            && parent.getNodeType() != Node.DOCUMENT_NODE
            && !parent.getNodeName().equals("section")) {
        hierarchy.push(parent);
        parent = parent.getParentNode();
    }

    Object obj;
    while (!hierarchy.isEmpty() && null != (obj = hierarchy.pop())) {
        Node node = (Node) obj;

        if (node.getNodeType() == Node.ELEMENT_NODE) { 
            builder.append("/").append(node.getNodeName());

            int prev_siblings = 1;
            Node prev_sibling = node.getPreviousSibling();

            while (null != prev_sibling) {
                if (prev_sibling.getNodeType() == node.getNodeType()) {
                    if (prev_sibling.getNodeName().equalsIgnoreCase(node.getNodeName())) {
                        prev_siblings++;
                    }
                }
                prev_sibling = prev_sibling.getPreviousSibling();
            }
            builder.append("[").append(prev_siblings).append("]");
        } 

        else if (node.getNodeType() == Node.ATTRIBUTE_NODE) {
            builder.append("/@");
            builder.append(node.getNodeName());
        }
    }

    return builder.toString();
}

Edit:

I added an if/else conditional as per @Andreas's solution, but I am still getting [1] indexes in my output.

        if (node.getNodeType() == Node.ELEMENT_NODE) { 
            builder.append("/").append(node.getNodeName());

            int prev_siblings = 1;
            Node prev_sibling = node.getPreviousSibling();

            while (null != prev_sibling) {
                if (prev_sibling.getNodeType() == node.getNodeType()) {
                    if (prev_sibling.getNodeName().equalsIgnoreCase(node.getNodeName())) {
                        prev_siblings++;
                    }
                }
                prev_sibling = prev_sibling.getPreviousSibling();
            }
            // edit Outside the while loop
            if(prev_siblings == 1 && node.getNextSibling() == null) {
                continue;
            } else builder.append("[").append(prev_siblings).append("]");
        } 

Now my output is:

/PayrollFormInfo/PaidPreparerInfo[1]/Address1[1]
/PayrollFormInfo/PaidPreparerInfo[1]/City[1]
/PayrollFormInfo/PaidPreparerInfo[1]/State[1]
/PayrollFormInfo/PaidPreparerInfo[1]/Zip[1]

It looks like your method works, but only for the root element.

One approach could be to remove the "[1]" from the key string before placing it in the Map. Something like this:

Map<String, List<String>> myMap = new HashMap<String, List<String>>();
// Let's say variable 'k' holds the key and the list 'v' holds the value
List<String> v = new ArrayList<String>();
String k = "/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[69]/PeriodBeginDate[1]";
// replace() takes a literal string, so the brackets need no escaping
myMap.put(k.replace("[1]", ""), v);

Further details on the replace function: String replace() method

First, you'd have to identify that you only have 1 occurrence. Example:

/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[70]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[71]/PeriodBeginDate[1]
/PayrollFormInfo[1]/CompanyInfo[1]/FederalTotals[71]/PeriodBeginDate[2]

Here, PeriodBeginDate for FederalTotals 70 is singular, but PeriodBeginDate for FederalTotals 71 is not.

Your current storage mechanism makes it extremely difficult to know which indexes can be "shortened".

You need to build a hierarchy structure for your keys, so you can check the "child count" of any node in the hierarchy.

Rather than adding the number when building your first map, build up your data using the new hierarchy structure, and construct the "path" when needed.
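
If rebuilding the data structure is not an option, the same "child count" idea can be approximated directly on the existing keys. The sketch below is only an illustration: PathShortener, shorten and walk are hypothetical names, and it assumes that every occurrence of an element shows up in at least one key of the map. It records the highest index seen for each element name under each indexed parent path, then drops the index wherever that highest value is 1.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question or answers.
class PathShortener {
    // One indexed step such as "FederalTotals[69]": group 1 = name, group 2 = index.
    private static final Pattern STEP = Pattern.compile("([^\\[\\]]+)\\[(\\d+)\\]");

    /** Returns a copy of the map whose keys omit indexes that never exceed 1. */
    static <V> Map<String, V> shorten(Map<String, V> indexed) {
        // "indexed parent path/element name" -> highest index seen under that parent.
        Map<String, Integer> maxIndex = new HashMap<>();
        for (String key : indexed.keySet()) {
            walk(key, maxIndex, true);           // pass 1: collect counts
        }
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : indexed.entrySet()) {
            result.put(walk(e.getKey(), maxIndex, false), e.getValue()); // pass 2: rewrite
        }
        return result;
    }

    // With record == true this only updates maxIndex; otherwise it returns the shortened key.
    private static String walk(String key, Map<String, Integer> maxIndex, boolean record) {
        StringBuilder parent = new StringBuilder(); // indexed path of the steps seen so far
        StringBuilder out = new StringBuilder();    // rewritten path
        for (String step : key.split("/")) {
            if (step.isEmpty()) {
                continue;                           // skip the empty piece before the leading "/"
            }
            String shortStep = step;
            Matcher m = STEP.matcher(step);
            if (m.matches()) {                      // attribute steps like "@id" do not match
                String slot = parent + "/" + m.group(1);
                if (record) {
                    maxIndex.merge(slot, Integer.parseInt(m.group(2)), Integer::max);
                } else if (maxIndex.get(slot) == 1) {
                    shortStep = m.group(1);         // only index 1 was ever seen: drop it
                }
            }
            out.append('/').append(shortStep);
            parent.append('/').append(step);
        }
        return out.toString();
    }
}
```

Calling PathShortener.shorten(myMap) on the three example keys above would keep the indexes on FederalTotals and on the PeriodBeginDate entries under FederalTotals[71], and strip the rest. If some occurrences never appear in the map, this heuristic can strip an index that is actually needed, which is why counting at the DOM level is the more reliable option.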

Using regex and replaceAll:

str = str.replaceAll("\\[1\\]", "");
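
For comparison, here is a small sketch (ReplaceDemo is a hypothetical class name) showing that the literal replace and the regex replaceAll behave the same for this pattern. Both leave larger indexes such as [11] alone, and both strip every [1], including one that belongs to an element with several occurrences.

```java
// Hypothetical demo class, not part of the original answers.
class ReplaceDemo {
    // Literal replacement: "[1]" is matched character for character.
    static String viaReplace(String path) {
        return path.replace("[1]", "");
    }

    // Regex replacement: the brackets must be escaped, otherwise "[1]" is a character class.
    static String viaReplaceAll(String path) {
        return path.replaceAll("\\[1\\]", "");
    }
}
```

So for /A[1]/B[11]/C[1] both methods return /A/B[11]/C, even if a sibling C[2] exists elsewhere in the map. Blind replacement is therefore only safe when [1] is known to mean "the only occurrence".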


Useful links:

Regex Java Tester Online

JavaSE String.replaceAll

With the addition of getFullXPath to the question, the solution is fairly simple.

The method is counting the number of "previous siblings" (+1) to know the number to assign. If that number is 1, check if it has a "next sibling", and don't add the number if it doesn't.
