
Not able to remove tags that have “xsi:nil” in them via XSLT

I have the following XML, which contains several tags with xsi:nil="true". These tags are effectively null. I have not been able to find or write an XSLT transformation that removes these tags and keeps the rest of the XML.

<?xml version="1.0" encoding="utf-8"?>
<p849:retrieveAllValues xmlns:p849="http://package.de.bc.a">
    <retrieveAllValues>
        <messages xsi:nil="true" />
        <existingValues>
            <Values>
                <value1> 10.00</value1>
                <value2>123456</value2>
                <value3>1234</value3>
                <value4 xsi:nil="true" />
                <value5 />
            </Values>
        </existingValues>
        <otherValues xsi:nil="true" />
        <recValues xsi:nil="true" />
    </retrieveAllValues>
</p849:retrieveAllValues>

The error you get,

[Fatal Error] file2.xml:5:30: The prefix "xsi" for attribute "xsi:nil" associated with an element type "messages" is not bound.

is caused by the "xsi" prefix not being declared anywhere in the document. Declare it on the root element, for example:

<p849:retrieveAllValues xmlns:p849="http://package.de.bc.a"
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<retrieveAllValues>
    <messages xsi:nil="true" />
    <!-- other elements... -->
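
Once the prefix is declared, an ordinary identity transform with one extra empty template should be enough to drop the nil elements. The following is a minimal sketch, not a tested solution for your web service: the class name XsltNilFilter, the method removeNils and the embedded stylesheet are made up for illustration, and it assumes the input already carries the xmlns:xsi declaration shown above.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class; names and stylesheet are illustrative only.
class XsltNilFilter {

    // Identity transform plus one empty template that matches (and so drops)
    // every element carrying xsi:nil="true".
    static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>"
        + "<xsl:template match='@*|node()'>"
        + "<xsl:copy><xsl:apply-templates select='@*|node()'/></xsl:copy>"
        + "</xsl:template>"
        + "<xsl:template match=\"*[@xsi:nil='true']\"/>"
        + "</xsl:stylesheet>";

    // Assumes the xsi prefix is declared in the input document.
    static String removeNils(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<p849:retrieveAllValues xmlns:p849='http://package.de.bc.a'"
            + " xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>"
            + "<retrieveAllValues><messages xsi:nil='true'/>"
            + "<existingValues><Values><value2>123456</value2>"
            + "<value4 xsi:nil='true'/><value5/></Values></existingValues>"
            + "</retrieveAllValues></p849:retrieveAllValues>";
        System.out.println(removeNils(xml));
    }
}
```

The empty template wins over the identity template because a pattern with a predicate has a higher default priority than node().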

Update

If you cannot change the XML document you receive from the web service, you could try the following approach (if it is acceptable for you):

  1. Change your XSLT document to process XML documents without specifying element prefixes
  2. Set the namespaceAware property of DocumentBuilderFactory to false

After this, your transformer shouldn't complain.
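
To see what step 2 changes, here is a small sketch that parses the same kind of document with namespace awareness switched off and on. The class name NamespaceAwarenessDemo and the method canParse are hypothetical, and the sample XML is a shortened stand-in for your document with the xsi prefix left undeclared.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; the names are illustrative only.
class NamespaceAwarenessDemo {

    // Returns true if the XML parses under the given namespace-awareness setting.
    static boolean canParse(String xml, boolean namespaceAware) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(namespaceAware);
            DocumentBuilder db = dbf.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors instead of printing them to stderr.
            db.setErrorHandler(new DefaultHandler());
            db.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException | IOException e) {
            // An unbound prefix surfaces here as a SAXParseException.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Shortened sample: the xsi prefix is deliberately not declared.
        String xml =
            "<p849:retrieveAllValues xmlns:p849='http://package.de.bc.a'>"
            + "<messages xsi:nil='true'/>"
            + "</p849:retrieveAllValues>";
        System.out.println("namespaceAware=false parses: " + canParse(xml, false));
        System.out.println("namespaceAware=true parses:  " + canParse(xml, true));
    }
}
```

Note that a non-namespace-aware parser treats "xsi:nil" as a plain attribute name, so any later processing has to look it up by that literal string.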

It doesn't look like this is going to be possible in XSLT. Because of the missing namespace declarations, you have to parse the XML file with a non-namespace-aware parser, but none of the XSLT processors I've tried gets on well with such documents. They must rely on some information that is only present when parsing with namespace awareness enabled, even if the document in question doesn't actually contain any namespaced nodes.

So you'll have to approach it a different way, for example by traversing the DOM tree yourself. Since you say you're working in Java, here's an example using the Java DOM APIs (the example runs as-is in the Groovy console; to run it as Java, wrap it in a proper class definition and add whatever exception handling is required):

import java.io.File;
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.w3c.dom.ls.*;

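public void stripNils(Node n) {
  if(n instanceof Element &&
      "true".equals(((Element)n).getAttribute("xsi:nil"))) {
    // element is xsi:nil - strip it out
    n.getParentNode().removeChild(n);
  } else {
    // we're keeping this node, process its children (if any) recursively;
    // iterate backwards because the NodeList is live: removing a child
    // shifts the indices of the siblings after it, so a forward loop
    // would skip the node that directly follows a removed element
    NodeList children = n.getChildNodes();
    for(int i = children.getLength() - 1; i >= 0; i--) {
      stripNils(children.item(i));
    }
  }
}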

// load the document (NB DBF is non-namespace-aware by default)
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document xmlDoc = db.parse(new File("input.xml"));

stripNils(xmlDoc);

// write out the modified document, in this example to stdout
LSSerializer ser =
  ((DOMImplementationLS)xmlDoc.getImplementation()).createLSSerializer();
LSOutput out =
  ((DOMImplementationLS)xmlDoc.getImplementation()).createLSOutput();
out.setByteStream(System.out);
ser.write(xmlDoc, out);

On your original example XML this produces the correct result:

<?xml version="1.0" encoding="UTF-8"?>
<p849:retrieveAllValues xmlns:p849="http://package.de.bc.a">
    <retrieveAllValues>

        <existingValues>
            <Values>
                <value1> 10.00</value1>
                <value2>123456</value2>
                <value3>1234</value3>

                <value5/>
            </Values>
        </existingValues>


    </retrieveAllValues>
</p849:retrieveAllValues>

The empty lines are not actually empty, they contain the whitespace text nodes either side of the removed elements, as only the elements themselves are being removed here.
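
If those leftover blank lines matter, one option is to remove the whitespace-only text node in front of each stripped element as well. The sketch below is one possible variant, not part of the answer above: the class name WhitespaceAwareNilStripper and its methods are hypothetical, it assumes data-style XML where whitespace between elements carries no meaning, and it prints through an identity Transformer only to keep the example short (the LSSerializer code above can be used instead).

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical variant of stripNils; names are illustrative only.
class WhitespaceAwareNilStripper {

    static boolean isBlankText(Node n) {
        return n != null && n.getNodeType() == Node.TEXT_NODE
            && n.getNodeValue().trim().isEmpty();
    }

    static void stripNils(Node parent) {
        // Snapshot the children first: the NodeList is live and we remove nodes below.
        NodeList live = parent.getChildNodes();
        List<Node> children = new ArrayList<>();
        for (int i = 0; i < live.getLength(); i++) {
            children.add(live.item(i));
        }
        for (Node child : children) {
            if (child instanceof Element
                    && "true".equals(((Element) child).getAttribute("xsi:nil"))) {
                // Drop the indentation in front of the element, then the element itself.
                Node before = child.getPreviousSibling();
                if (isBlankText(before)) {
                    parent.removeChild(before);
                }
                parent.removeChild(child);
            } else {
                stripNils(child);
            }
        }
    }

    static String strip(String xml) {
        try {
            // DocumentBuilderFactory is non-namespace-aware by default.
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            stripNils(doc);
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            t.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not strip nil elements", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<Values>\n  <value3>1234</value3>\n"
            + "  <value4 xsi:nil=\"true\" />\n  <value5 />\n</Values>";
        System.out.println(strip(xml));
    }
}
```

Removing the text node before the element, rather than the one after it, keeps the indentation of the closing tag intact when the stripped element is the last child.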
