
Woodstox/XML1.1/XSD Parsing+Validation and XInclude

Please help me with the Java/Woodstox code below. My example also includes an XSD and two XML files.

Main problem

I turned on validation and expected validation errors because

  • the IDs foo1 and foo2 are each defined twice in test2.xml,
  • the ID foo is referenced but not defined in test2.xml (unless the ID from test1.xml is taken into account via the XInclude, which is what I would like), and
  • the ID foo3 is referenced but not defined in test2.xml.

However, no validation problem is reported.

test.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

    <xs:attributeGroup name="ATTRIBUTES_TYPE_node">
        <xs:attribute name="id1" type="xs:ID" use="required"/>
        <xs:attribute name="id2" type="xs:ID" use="required"/>
        <xs:attribute name="idref" type="xs:IDREF" use="optional"/>
    </xs:attributeGroup>

    <xs:complexType name="TYPE_node">
        <xs:attributeGroup ref="ATTRIBUTES_TYPE_node"/>
    </xs:complexType>

    <xs:complexType name="TYPE_root">
        <xs:sequence>
            <xs:element name="node" type="TYPE_node" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
    </xs:complexType>

    <xs:element name="root" type="TYPE_root"/>

</xs:schema>

test1.xsd

<?xml version="1.1" encoding="utf-8"?>
<root
        xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
        vc:minVersion="1.1"
        vc:noNamespaceSchemaLocation="test.xsd">
    <node id1="foo" id2="bar" idref="foo"/>
</root>

test2.xsd

<?xml version="1.1" encoding="utf-8"?>
<root
        xmlns:vc="http://www.w3.org/2007/XMLSchema-versioning"
        vc:minVersion="1.1"
        xmlns:xi="http://www.w3.org/2001/XInclude"
        vc:noNamespaceSchemaLocation="test.xsd">
    <xi:include href="test1.xml">
        <xi:fallback/>
    </xi:include>

    <node id1="foo1" id2="foo2" idref="foo"/>
    <node id1="foo1" id2="foo2" idref="foo3"/>
</root>

Java code

XMLInputFactory xmlInputFactory = XMLInputFactory2.newInstance();
xmlInputFactory.setProperty(XMLInputFactory.IS_COALESCING, true);
xmlInputFactory.setProperty(XMLInputFactory.IS_VALIDATING, true);
xmlInputFactory.setProperty(XMLInputFactory.IS_NAMESPACE_AWARE, true);
XMLStreamReader xmlStreamReader = xmlInputFactory.createXMLStreamReader(new FileReader("src/main/resources/test2.xml"));

try {
    xmlStreamReader.nextTag();
    xmlStreamReader.require(XMLStreamConstants.START_ELEMENT, null, "root");
    xmlStreamReader.nextTag();

    while (true) {
        if (xmlStreamReader.getEventType() == XMLStreamConstants.START_ELEMENT)
            for (int i = 0; i < xmlStreamReader.getAttributeCount(); i++) {
                System.out.println("getAttributePrefix=" + xmlStreamReader.getAttributePrefix(i));
                System.out.println("getAttributeLocalName=" + xmlStreamReader.getAttributeLocalName(i));
                System.out.println("getAttributeName=" + xmlStreamReader.getAttributeName(i));
                System.out.println("getAttributeNamespace=" + xmlStreamReader.getAttributeNamespace(i));
                System.out.println("getAttributeType=" + xmlStreamReader.getAttributeType(i));
                System.out.println("getAttributeValue=" + xmlStreamReader.getAttributeValue(i));
            }
        xmlStreamReader.next();
    }
} finally {
    xmlStreamReader.close();
}

What I tried instead

Would it be better to use SAXParserFactory saxParserFactory = WstxSAXParserFactory.newInstance(); instead of XMLInputFactory xmlInputFactory = XMLInputFactory2.newInstance(); as a first step? What is the difference?

However, with that approach I ran into a problem: calling saxParser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage","http://www.w3.org/2007/XMLSchema-versioning"); threw a SAXNotSupportedException with the message "The specified schema language is not supported."

At least there I could install an error handler with xmlReader.setErrorHandler(new SimpleErrorHandler());, which I did not do in my code above.

Addon Question1

Which is better for me: createXMLStreamReader or createXMLEventReader?

Addon Question2

Do I need to adjust my XSD/XML files? Especially the headers?

Addon Question3

Do I need to resolve the XIncludes before parsing/validation? If so, how?

Further Context

  • Clearly, the code is at an early stage, and I have not bothered much about how it ends.
  • I use XML1.1 because I need XML elements with more than one ID attribute.
  • I use XInclude because I want to keep my XML files modular and avoid duplicating XML.
  • IntelliJ does not validate my files, so I am trying to dig a bit deeper here. I assume the two problems are unrelated for now, because here I get no validation problem whereas I get one in the other thread.
  • I posted almost the same question to the Woodstox mailing list (thread), but there is almost no activity.

A few points:

  1. When you say XML 1.1, I think you mean XSD 1.1.

  2. Your Woodstox code makes no attempt to enable schema validation.

  3. It's confusing (but not actually incorrect) to use a .xsd file extension for files that are ordinary XML instance files, not schema documents.

  4. I don't know if there is any way to enable schema validation (especially XSD 1.1 validation) with Woodstox, except by using the Saxon validator, which allows the input to a schema validator to be a StAXSource.

  5. There are two XSD 1.1 processors available in the Java world: Xerces and Saxon. If you're going to use the Xerces schema validator, I think you probably need to use it with the Xerces XML parser (it might be possible to decouple them, but I don't know why you would want to). If you choose the Saxon XSD processor, then it will work with any SAX or StAX parser, but I don't see any benefit in using StAX (i.e. Woodstox), because the XSD processor sits on a push pipeline and that makes SAX a better fit.

  6. As regards XInclude processing, I think you probably want to do it before XSD validation. That's easy enough when you use Xerces as the XML parser and Saxon as the schema validator. It might also be possible with other product combinations; I don't know for certain.
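To illustrate the ordering in point 6 (resolve XIncludes first, then validate the merged document), here is a minimal sketch that uses only the JAXP classes bundled with the JDK rather than Woodstox or Saxon. Several things in it are assumptions for illustration and do not come from the question: the file names part.xml and main.xml, the class name XIncludeThenValidate, and the simplified schema. The JDK's built-in validator implements XSD 1.0 only, and XSD 1.0 does not allow two xs:ID attributes on one element, so the schema here declares a single id attribute. The original test.xsd with id1 and id2 would still need an XSD 1.1 processor as described in point 5. The fixup-base-uris setting is a Xerces feature that the JDK parser is assumed to recognise; switching it off keeps the included element from gaining an xml:base attribute that the schema does not declare.

```java
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

public class XIncludeThenValidate {

    // Hypothetical schema: like test.xsd, but with ONE xs:ID attribute per element,
    // because the JDK's built-in validator is an XSD 1.0 processor.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='root'><xs:complexType><xs:sequence>"
      + "<xs:element name='node' minOccurs='0' maxOccurs='unbounded'><xs:complexType>"
      + "<xs:attribute name='id' type='xs:ID' use='required'/>"
      + "<xs:attribute name='idref' type='xs:IDREF'/>"
      + "</xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Hypothetical instance documents (not the files from the question).
    static final String PART_XML = "<node id='foo'/>";
    static final String MAIN_XML =
        "<root xmlns:xi='http://www.w3.org/2001/XInclude'>"
      + "<xi:include href='part.xml'/>"
      + "<node id='foo1' idref='foo'/>"
      + "</root>";
    static final String BAD_XML =
        "<root><node id='foo1' idref='foo1'/><node id='foo1' idref='foo3'/></root>";

    /** Parses the file, optionally resolving XIncludes, then validates the DOM tree. */
    static List<String> validate(Path xml, boolean resolveXIncludes) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);               // required for XInclude and XSD
        if (resolveXIncludes) {
            dbf.setXIncludeAware(true);
            // Assumption: the parser is Xerces-based (as in the JDK) and knows this feature.
            dbf.setFeature("http://apache.org/xml/features/xinclude/fixup-base-uris", false);
        }
        Document doc = dbf.newDocumentBuilder().parse(xml.toFile());

        Schema schema = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)));
        Validator validator = schema.newValidator();

        // Collect errors instead of stopping at the first one.
        List<String> errors = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        validator.validate(new DOMSource(doc));    // validation runs on the merged tree
        return errors;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("xinclude-demo");
        Files.writeString(dir.resolve("part.xml"), PART_XML);
        Path main = Files.writeString(dir.resolve("main.xml"), MAIN_XML);
        Path bad = Files.writeString(dir.resolve("bad.xml"), BAD_XML);

        System.out.println("main.xml, XInclude resolved: " + validate(main, true));
        System.out.println("main.xml, XInclude ignored:  " + validate(main, false));
        System.out.println("bad.xml:                     " + validate(bad, true));
    }
}
```

With XInclude resolution switched on, the IDREF foo in main.xml can bind to the ID declared in part.xml. With it switched off, the validator sees a literal xi:include element and an unbound IDREF. The bad.xml case mirrors the duplicate-ID and dangling-IDREF situation from the question. Note that XMLInputFactory.IS_VALIDATING in the question's code refers to DTD validation, which is why it does not trigger any of these schema checks.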
