
Validate XML against XSD using Saxon API C# not reporting all validation errors

I am trying to validate XML against an XSD using the Saxon API for C#/.NET. However, it does not catch all the validation errors in one go. Every element that violates its data type is caught, but if an element's structure is invalid in several places, only the first error is reported. I have created the sample code below.

XSD File

<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema" targetNamespace="urn:books" xmlns:bks="urn:books">
    <xsd:element name="books" type="bks:BooksForm"/>
    <xsd:complexType name="BooksForm">
        <xsd:sequence>
            <xsd:element name="book" type="bks:BookForm" minOccurs="0" maxOccurs="unbounded"/>
        </xsd:sequence>
    </xsd:complexType>
    <xsd:complexType name="BookForm">
        <xsd:sequence>
            <xsd:element name="author" type="xsd:string"/>
            <xsd:element name="title" type="xsd:string"/>
            <xsd:element name="genre" type="xsd:string"/>
            <xsd:element name="price" type="xsd:float" />
            <xsd:element name="pub_date" type="xsd:date" />
            <xsd:element name="review" type="xsd:string"/>
        </xsd:sequence>
        <xsd:attribute name="id" type="xsd:string"/>
    </xsd:complexType>
</xsd:schema>

XML File

<?xml version="1.0"?>
<x:books xmlns:x="urn:books">
    <book id="bk001">
        <author>Writer</author>
        <title>The First Book</title>
        <genre>Fiction</genre>
        <price>44.95</price>
        <pub_date>2000-10-01</pub_date>
        <review>An amazing story of nothing.</review>
    </book>

    <book id="bk002">
        <author>Poet</author>
        <title>The Poet's First Poem</title>
        <genre>Poem</genre>
        <price>ABC</price>
        <review>Least poetic poems.</review>
    </book>

    <book id="bk003">
        <bad_element_1></bad_element_1>
        <bad_element_2></bad_element_2>
        <author>Writer</author>
        <title>The First Book</title>
        <genre>Fiction</genre>
        <price>ABC</price>
        <pub_date>2000-10-01</pub_date>
        <review>An amazing story of nothing.</review>
    </book>
</x:books>

C# Code

        public void run(string xmlPath, string xsdPath)
        {
            EnterpriseConfiguration conf = new EnterpriseConfiguration();
            conf.setConfigurationProperty(FeatureKeys.LICENSE_FILE_LOCATION, @"C:\saxon\saxon-license.lic");
            Processor processor = new Processor(conf);
            processor.SetProperty("http://saxon.sf.net/feature/timing", "true");
            processor.SetProperty("http://saxon.sf.net/feature/validation-warnings", "false"); //Set to true to suppress the exception
            SchemaManager manager = processor.SchemaManager;
            manager.XsdVersion = "1.1";
            List<Error> errorList = new();
            manager.ErrorReporter = err => errorList.Add(err);
            XmlReader xsdReader = XmlReader.Create(xsdPath);
            try
            {
                manager.Compile(xsdReader);
            }
            catch (Exception e)
            {
                Console.WriteLine(e);
                Console.WriteLine("Schema compilation failed with " + errorList.Count + " errors");
                foreach (Error error in errorList)
                {
                    Console.WriteLine("At line " + error.Location.LineNumber + ": " + error.Message);
                }
                return;
            }
            SchemaValidator validator = manager.NewSchemaValidator();
            XmlReaderSettings xmlReaderSettings = new XmlReaderSettings();
            xmlReaderSettings.ValidationType = ValidationType.Schema;
            xmlReaderSettings.ValidationFlags = XmlSchemaValidationFlags.ProcessInlineSchema | XmlSchemaValidationFlags.ReportValidationWarnings;
            XmlReader xmlReader = XmlReader.Create(xmlPath, xmlReaderSettings);
            Console.WriteLine("Validating input file " + xmlPath);
            List<ValidationFailure> errors = new();
            validator.InvalidityListener = failure => errors.Add(failure);
            XdmDestination psvi = new();
            validator.SetDestination(psvi);
            try
            {
                validator.Validate(xmlReader);
            }
            catch (Exception e)
            {
                Console.WriteLine(e); Console.WriteLine(); Console.WriteLine();
                Console.WriteLine("Instance validation failed with " + errors.Count + " errors"); Console.WriteLine(); Console.WriteLine();
                foreach (ValidationFailure error in errors)
                {
                    Console.WriteLine("At line " + error.LineNumber + ": " + error.Message); Console.WriteLine(); Console.WriteLine();
                }
                return;
            }
            Console.WriteLine("Input file is valid");
        }

Output: [image of the console screen]

The validation error highlighted in the image is not reported in the output.

Please help. Thanks.

I expect to catch all the validation errors in one go.

Generally, if the content of an element (the sequence of children) doesn't match the content model defined for that element in the schema, Saxon regards that as one validation error, and doesn't attempt further validation until it gets to the end of the invalid element.

Error recovery from parsing errors is a bit of an art and there's no universal solution; the one thing people hate is when it's done incorrectly, leading to hundreds of spurious errors.

In your particular example you're highlighting that there's no error reported for bad_element_2. But how is the processor supposed to know what can validly follow a bad_element_1? The schema doesn't say. You've already departed from the rule book, so the processor can't find a rule to apply here.

You could adopt the approach that the schema doesn't allow anything after a bad_element_1, and therefore anything that follows is another error. But that would lead to lots of spurious errors that you don't want.
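
To make that trade-off concrete, here is a toy sketch. It is not Saxon's algorithm and uses no Saxon API. It only compares a list of child element names against the fixed sequence that BookForm requires, under two hypothetical recovery strategies. The names RecoveryDemo, StopAtFirstMismatch and FlagEverythingAfterMismatch are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Toy illustration only: NOT Saxon's validation algorithm.
// Checks child element names against a fixed required sequence and
// contrasts two hypothetical error-recovery strategies.
public static class RecoveryDemo
{
    // Strategy A: report the first child that breaks the sequence, then
    // skip the rest of the element (one error per invalid element).
    public static List<string> StopAtFirstMismatch(
        IReadOnlyList<string> expected, IReadOnlyList<string> actual)
    {
        var errors = new List<string>();
        for (int i = 0; i < actual.Count; i++)
        {
            if (i >= expected.Count || actual[i] != expected[i])
            {
                errors.Add($"Unexpected <{actual[i]}>");
                return errors; // give up on this element's content
            }
        }
        if (actual.Count < expected.Count)
            errors.Add($"Missing <{expected[actual.Count]}>");
        return errors;
    }

    // Strategy B: after the first mismatch, assume nothing may follow,
    // so every later child is reported as an error as well.
    public static List<string> FlagEverythingAfterMismatch(
        IReadOnlyList<string> expected, IReadOnlyList<string> actual)
    {
        var errors = new List<string>();
        bool failed = false;
        for (int i = 0; i < actual.Count; i++)
        {
            if (failed || i >= expected.Count || actual[i] != expected[i])
            {
                failed = true;
                errors.Add($"Unexpected <{actual[i]}>");
            }
        }
        if (!failed && actual.Count < expected.Count)
            errors.Add($"Missing <{expected[actual.Count]}>");
        return errors;
    }

    public static void Main()
    {
        // Required children of BookForm, in schema order.
        string[] expected = { "author", "title", "genre", "price", "pub_date", "review" };
        // Children of book bk003 from the question.
        string[] bk003 = { "bad_element_1", "bad_element_2", "author", "title",
                           "genre", "price", "pub_date", "review" };

        Console.WriteLine(StopAtFirstMismatch(expected, bk003).Count);         // prints 1
        Console.WriteLine(FlagEverythingAfterMismatch(expected, bk003).Count); // prints 8
    }
}
```

With strategy B, bad_element_2 does get reported, but so do the six perfectly good elements that follow it. That is exactly the kind of noise the one-error-per-element behaviour avoids.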
