
How do I parse XML content that may or may not have a namespace?

I need to parse some XML content for which I have the XSD. In general, this is straightforward. However, in one particular case, the XML sometimes declares the XML namespace and sometimes does not. Requiring the namespace is not practical, because the supplied XML comes from multiple sources, so I need a way to handle both forms.

As noted, I have the XSD for the XML and I have used XJC (from JAXB) to generate the corresponding XML entity classes from the XSD.

Sample XML including the namespace:

<?xml version="1.0" encoding="UTF-8"?>
<root xmlns="http://www.w3.org/namespace/">
    <foo id="123">
        <bar>value</bar>
    </foo>
</root>

Sample XML excluding the namespace:

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <foo id="123">
        <bar>value</bar>
    </foo>
</root>

As you can see, the XML content is identical in structure; the only difference is the xmlns attribute on the root element.

My code is as follows:

URI uri = <URI of XML file>
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();
Node node = builder.parse(uri.toString()); // Parsing succeeds, i.e. the XML is well-formed.
JAXBContext context = JAXBContext.newInstance("com.example.xml");
Unmarshaller parser = context.createUnmarshaller();
// Next line succeeds or fails, depending on presence of namespace
Object object = parser.unmarshal(node);

The XML is always successfully parsed into a Node. If the xmlns attribute is present in the XML, then the entire process completes normally and I receive an instance of the com.example.xml.Root class (which was generated using XJC). From there I can access the Foo and Bar objects.

If the xmlns attribute is absent, then the unmarshalling fails with the following exception:

javax.xml.bind.UnmarshalException: unexpected element (uri:"", local:"root").
    Expected elements are <{http://www.w3.org/namespace/}root>,
    <{http://www.w3.org/namespace/}foo>,
    <{http://www.w3.org/namespace/}bar>
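
The uri:"" in the message can be checked directly on the DOM. The following is a minimal sketch, not part of the original code: the class name RootNamespaceCheck and the helper rootNamespace are made up for illustration. With a namespace-aware parser, the root element of the second sample reports no namespace at all, which is why it does not match the {http://www.w3.org/namespace/}root element that the generated classes expect.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class RootNamespaceCheck {

    /** Returns the namespace URI of the document element, or null if it has none. */
    public static String rootNamespace(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getNamespaceURI();
    }

    public static void main(String[] args) throws Exception {
        // Root element that declares the default namespace.
        System.out.println(rootNamespace("<root xmlns=\"http://www.w3.org/namespace/\"/>"));
        // Root element without any namespace declaration.
        System.out.println(rootNamespace("<root/>"));
    }
}
```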

I tried unmarshalling by declared type, with limited success. The unmarshalling completed without error, but the resulting Root object did not contain any Foo or Bar objects.

The code for this involves changing the last line to:

Object object = parser.unmarshal(node, Root.class);

I tried unmarshalling with the "namespace aware" flag set to false, but this failed with an error.

I've thought about adding a namespace to the node if it does not have one, prior to unmarshalling. However the API does not seem to permit this.

Another thought I had was to have two sets of generated classes, one for each case (i.e. with and without the namespace). However, this seems like too much of a kludge.

So I'm stuck. Any suggestions? Or is what I'm trying to do impossible?

You can do this with an XML filter. Here is an example that removes the namespace wherever it is present, so that both forms of the document unmarshal into the same classes.

package testjaxb;

import java.io.StringReader;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Unmarshaller;
import javax.xml.transform.sax.SAXSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;
import org.xml.sax.helpers.XMLReaderFactory;

public class MarshalWithFilter {

    public static void main(String[] args) throws Exception {
        String xmlString = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<root xmlns=\"http://www.w3.org/namespace/\">\n"
                + "    <foo id=\"123\">\n"
                + "        <bar>value</bar>\n"
                + "    </foo>\n"
                + "</root>";

        String xmlStringWithoutNs = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<root>\n"
                + "    <foo id=\"123\">\n"
                + "        <bar>value</bar>\n"
                + "    </foo>\n"
                + "</root>";

        Root r = (Root) unmarshal(xmlString);
        System.out.println("root.." + r.getFoo().getId());
        System.out.println("root.." + r.getFoo().getBar());
        r = (Root) unmarshal(xmlStringWithoutNs);
        System.out.println("root.." + r.getFoo().getId());
        System.out.println("root.." + r.getFoo().getBar());
    }

    private static Root unmarshal(String sampleXML) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Root.class);
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        XMLReader reader = XMLReaderFactory.createXMLReader();
        IngoreNamespaceFilter nsFilter = new IngoreNamespaceFilter();
        nsFilter.setParent(reader);
        StringReader stringReader = new StringReader(sampleXML);
        InputSource is = new InputSource(stringReader);
        SAXSource source = new SAXSource(nsFilter, is);
        System.out.println("" + sampleXML);
        return (Root) unmarshaller.unmarshal(source);
    }
}

class IngoreNamespaceFilter extends XMLFilterImpl {

    public IngoreNamespaceFilter() {
        super();
    }

    @Override
    public void startDocument() throws SAXException {
        super.startDocument();
    }

    @Override
    public void startElement(String uri, String localName, String qName,
            Attributes atts) throws SAXException {
        // Report every element as being in no namespace (empty URI).
        super.startElement("", localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        // Must mirror startElement: same empty namespace URI.
        super.endElement("", localName, qName);
    }

    @Override
    public void startPrefixMapping(String prefix, String uri)
            throws SAXException {
        // Swallow namespace declarations so they are not passed downstream.
    }

}
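
The filter above strips the namespace, which suits classes declared without one. If the classes were generated by XJC from an XSD with a target namespace, as in the question, the opposite direction may be more convenient: leave namespaced input alone and fill in the namespace where it is missing. The following is a sketch under that assumption. The names AddNamespaceDemo, AddNamespaceFilter and rootNamespace are hypothetical, and the demo only records what a downstream handler sees rather than calling JAXB. It obtains the reader from SAXParserFactory because XMLReaderFactory is deprecated as of Java 9.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical demo class, for illustration only.
public class AddNamespaceDemo {

    /** Fills in a default namespace URI on elements that arrive without one. */
    public static class AddNamespaceFilter extends XMLFilterImpl {
        private final String defaultUri;

        public AddNamespaceFilter(String defaultUri) {
            this.defaultUri = defaultUri;
        }

        private String fix(String uri) {
            return (uri == null || uri.isEmpty()) ? defaultUri : uri;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                Attributes atts) throws SAXException {
            super.startElement(fix(uri), localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            super.endElement(fix(uri), localName, qName);
        }
    }

    /** Parses xml through the filter and returns the root's namespace URI as seen downstream. */
    public static String rootNamespace(String xml, String defaultUri) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed so localName and uri are reported
        XMLReader reader = factory.newSAXParser().getXMLReader();

        AddNamespaceFilter filter = new AddNamespaceFilter(defaultUri);
        filter.setParent(reader);

        final String[] seen = new String[1];
        filter.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                    Attributes atts) {
                if (seen[0] == null) {
                    seen[0] = uri; // remember the first (root) element only
                }
            }
        });
        filter.parse(new InputSource(new StringReader(xml)));
        return seen[0];
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://www.w3.org/namespace/";
        System.out.println(rootNamespace("<root xmlns=\"" + ns + "\"><foo/></root>", ns));
        System.out.println(rootNamespace("<root><foo/></root>", ns));
    }
}
```

To use such a filter with JAXB, it would be wrapped in a SAXSource together with the InputSource and passed to unmarshaller.unmarshal, in the same way as the filter above; the unmarshaller then installs its own content handler on the filter.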

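The POJOs are below. Note that they declare no namespace, which is why the filter maps every element to the empty namespace: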

Root

package testjaxb;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlRootElement;

@XmlRootElement(name="root")
@XmlAccessorType(XmlAccessType.FIELD)
public class Root
{
    private Foo foo;


    public Foo getFoo ()
    {
        return foo;
    }

    public void setFoo (Foo foo)
    {
        this.foo = foo;
    }


}

Foo

package testjaxb;

import javax.xml.bind.annotation.XmlAccessType;
import javax.xml.bind.annotation.XmlAccessorType;
import javax.xml.bind.annotation.XmlAttribute;


@XmlAccessorType(XmlAccessType.FIELD)
public class Foo
{
    @XmlAttribute
    private String id;

    private String bar;

    public String getId ()
    {
        return id;
    }

    public void setId (String id)
    {
        this.id = id;
    }

    public String getBar ()
    {
        return bar;
    }

    public void setBar (String bar)
    {
        this.bar = bar;
    }


}
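
Since the question already parses the input into a DOM Node, another option may be to add the namespace on the DOM before unmarshalling. DOM Level 3 defines Document.renameNode for this purpose. The sketch below assumes the DOM implementation supports that method (it can throw a DOMException with NOT_SUPPORTED_ERR otherwise). The class DomNamespaceFixer and its methods are hypothetical names, not part of the original code.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class, for illustration only.
public class DomNamespaceFixer {

    /** Parses the XML string with a namespace-aware DOM parser. */
    public static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    /** Moves every element under node that has no namespace into namespaceUri. */
    public static void applyNamespace(Document doc, Node node, String namespaceUri) {
        if (node.getNodeType() == Node.ELEMENT_NODE && node.getNamespaceURI() == null) {
            // renameNode may return a new node that replaces the old one in the tree.
            node = doc.renameNode(node, namespaceUri, node.getNodeName());
        }
        Node child = node.getFirstChild();
        while (child != null) {
            // Capture the sibling first, in case the child is replaced by the rename.
            Node next = child.getNextSibling();
            applyNamespace(doc, child, namespaceUri);
            child = next;
        }
    }

    public static void main(String[] args) throws Exception {
        String ns = "http://www.w3.org/namespace/";
        Document doc = parse("<root><foo id=\"123\"><bar>value</bar></foo></root>");
        applyNamespace(doc, doc.getDocumentElement(), ns);
        System.out.println(doc.getDocumentElement().getNamespaceURI());
    }
}
```

After this step, doc.getDocumentElement() could be handed to the unmarshaller in place of the original node. Elements that already carry a namespace are left untouched, so the same code path should serve both kinds of input.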
