
XML (.xsd) feed validation against a schema

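I have an XML file and an XML schema. I want to validate the file against that schema and check whether it conforms. I am using Python, but I am open to any language if Python has no suitable library.

What are my best options here? My main concern is how quickly I can get this up and running.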

Definitely lxml.

Define an XMLParser with a predefined schema, load the file contents with fromstring(), and catch any schema validation errors:

from lxml import etree

def validate(xmlparser, xmlfilename):
    """Return True if the file parses and conforms to the parser's schema."""
    try:
        with open(xmlfilename, 'r') as f:
            etree.fromstring(f.read(), xmlparser)
        return True
    except etree.XMLSyntaxError:
        # A schema-aware parser reports validation failures
        # (and malformed XML) as XMLSyntaxError.
        return False

# Load and compile the schema once, then reuse the parser for every file.
schema_file = 'schema.xsd'
with open(schema_file, 'r') as f:
    schema_root = etree.XML(f.read())

schema = etree.XMLSchema(schema_root)
xmlparser = etree.XMLParser(schema=schema)

filenames = ['input1.xml', 'input2.xml', 'input3.xml']
for filename in filenames:
    if validate(xmlparser, filename):
        print("%s validates" % filename)
    else:
        print("%s doesn't validate" % filename)
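If you also want to know why a document fails, one option is to parse it without a validating parser, call the schema's validate() method, and read its error_log afterwards. The following is only a minimal sketch: the inline schema, the sample documents, and the names SCHEMA_XSD and validation_errors are made up for illustration and are not part of the answer above.

```python
from lxml import etree

# Hypothetical inline schema for illustration: a root <a> holding an integer.
SCHEMA_XSD = b"""<?xml version="1.0" encoding="UTF-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="a" type="xsd:integer"/>
</xsd:schema>"""

schema = etree.XMLSchema(etree.XML(SCHEMA_XSD))


def validation_errors(schema, xml_bytes):
    """Return a list of (line, message) tuples; empty if the document is valid."""
    doc = etree.fromstring(xml_bytes)  # plain parse: checks well-formedness only
    if schema.validate(doc):           # True when the document conforms
        return []
    # error_log holds the messages from the most recent validation run.
    return [(entry.line, entry.message) for entry in schema.error_log]


print(validation_errors(schema, b"<a>5</a>"))
for line, message in validation_errors(schema, b"<a>not a number</a>"):
    print("line %d: %s" % (line, message))
```

Here validate() returns a boolean instead of raising, so the caller decides how to report or log each failure.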

Mind the encoding

If the schema file contains an XML declaration with an encoding (e.g. <?xml version="1.0" encoding="UTF-8"?>), the code above produces the following error:

Traceback (most recent call last):
  File "<input>", line 2, in <module>
    schema_root = etree.XML(f.read())
  File "src/lxml/etree.pyx", line 3192, in lxml.etree.XML
  File "src/lxml/parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

The solution is to open the files in binary mode: open(..., 'rb')

[...]
def validate(xmlparser, xmlfilename):
    try:
        with open(xmlfilename, 'rb') as f:
[...]
with open(schema_file, 'rb') as f:
[...]
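The difference between text and bytes input can be tried without any files. This is a rough sketch under assumptions: SCHEMA_TEXT and parses_ok are hypothetical names, and the schema content is invented for illustration. According to the traceback above, the str input is rejected with a ValueError, while the same content encoded to bytes parses.

```python
from lxml import etree

# Hypothetical schema text for illustration, with an encoding declaration.
SCHEMA_TEXT = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'
    '<xsd:element name="a" type="xsd:integer"/>'
    '</xsd:schema>'
)


def parses_ok(data):
    """Return True if etree.XML accepts the input, False on ValueError."""
    try:
        etree.XML(data)
        return True
    except ValueError:
        return False


# str input that carries an encoding declaration (the case shown in the traceback)
print("str:  ", parses_ok(SCHEMA_TEXT))
# the same content as bytes, which is what open(..., 'rb') yields
print("bytes:", parses_ok(SCHEMA_TEXT.encode("utf-8")))
```

Passing a filename to etree.parse() is another way to avoid the problem, since lxml then reads the raw bytes itself and honors the declared encoding.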

The Python snippet is good, but an alternative is to use xmllint:

xmllint --schema sample.xsd --noout sample.xml
