python: examine XSD xml schema

I would like to examine an XSD schema in Python. Currently I'm using lxml, which does its job very well when it only has to validate a document against the schema. But I want to know what is inside the schema and access its elements with lxml-style behavior.

The schema:

<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:include schemaLocation="worker_remote_base.xsd"/>
    <xsd:include schemaLocation="transactions_worker_responses.xsd"/>
    <xsd:include schemaLocation="transactions_worker_requests.xsd"/>
</xsd:schema>

The lxml code to load the schema is (simplified):

from lxml import etree

# Read the schema as bytes so lxml can honour the encoding in the XML declaration.
with open(self._xsd_file, 'rb') as xsd_file_handle:
    xsd_text = xsd_file_handle.read()
schema_document = etree.fromstring(xsd_text, base_url=xmlpath)
xmlschema = etree.XMLSchema(schema_document)

I'm then able to use schema_document (which is an etree._Element) to go through the schema as an XML document. But since etree.fromstring (at least it seems that way) expects a plain XML document, the xsd:include elements are not processed.
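To make the observation above concrete, here is a minimal sketch that parses the schema from the question and lists the xsd:include targets, which remain ordinary child elements after parsing. The `include_locations` name is only illustrative, and the fallback import is an assumption added so the snippet also runs where lxml is not installed; it uses only the API subset that lxml and the standard library's ElementTree share.

```python
try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib, which shares the calls used below.
    import xml.etree.ElementTree as etree

XSD_NS = "http://www.w3.org/2001/XMLSchema"

xsd_text = b"""<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:include schemaLocation="worker_remote_base.xsd"/>
    <xsd:include schemaLocation="transactions_worker_responses.xsd"/>
    <xsd:include schemaLocation="transactions_worker_requests.xsd"/>
</xsd:schema>"""

schema_document = etree.fromstring(xsd_text)

# Clark notation ({namespace}localname) matches whatever prefix the file uses.
include_locations = [
    inc.get("schemaLocation")
    for inc in schema_document.findall("{%s}include" % XSD_NS)
]
print(include_locations)
```

Matching on the namespace-qualified tag is a little stricter than checking whether the tag ends with "include", since it cannot pick up a same-named element from another namespace.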

The problem is currently solved by parsing the first schema document, then loading the included schemas and inserting their contents one by one into the main document by hand:

import os
from lxml import etree

BASE_URL = "/xml/"
XSD_INCLUDE = "{http://www.w3.org/2001/XMLSchema}include"

schema_document = etree.fromstring(xsd_text, base_url=BASE_URL)
tree = schema_document.getroottree()

schemas = []
# findall() returns a list, so removing children inside the loop is safe.
for schema_child in schema_document.findall(XSD_INCLUDE):
    path = os.path.join(BASE_URL, schema_child.get("schemaLocation"))
    try:
        # Read bytes so the XML declaration's encoding is respected.
        with open(path, "rb") as h:
            schemas.append(etree.fromstring(h.read(), base_url=BASE_URL))
    except Exception as ex:
        print("failed to load schema: %s" % ex)
    # remove the <xsd:include ...> element
    schema_document.remove(schema_child)

for s in schemas:
    # move everything inside the included <schema> into the main document
    for s_child in list(s):
        schema_document.append(s_child)
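The manual merge above handles one level of includes. A possible generalisation, offered only as a sketch, is a small recursive helper that also follows includes inside included files and keeps the inserted declarations at the position of the original xsd:include. The name `load_with_includes` and its `_seen` parameter are hypothetical, not part of lxml. It does not handle xsd:import or xsd:redefine, does not check target namespaces, and assumes all files use the same namespace prefixes, since prefix declarations referenced only inside attribute values (such as `type="..."`) may not carry over when elements are moved. The fallback import is again an assumption so the snippet runs without lxml.

```python
import os

try:
    from lxml import etree  # the library used in the question
except ImportError:
    # Assumption: fall back to the stdlib, which shares the calls used below.
    import xml.etree.ElementTree as etree

XSD_INCLUDE = "{http://www.w3.org/2001/XMLSchema}include"


def load_with_includes(path, _seen=None):
    """Parse the schema at `path` and splice in every xsd:include, recursively."""
    seen = set() if _seen is None else _seen
    seen.add(os.path.abspath(path))  # remember visited files to stop include cycles

    root = etree.parse(path).getroot()
    base_dir = os.path.dirname(os.path.abspath(path))

    # findall() returns a list, so the root can be modified inside the loop.
    for include in root.findall(XSD_INCLUDE):
        position = list(root).index(include)
        root.remove(include)
        target = os.path.abspath(
            os.path.join(base_dir, include.get("schemaLocation"))
        )
        if target in seen:
            continue  # already merged elsewhere
        included_root = load_with_includes(target, seen)
        # Insert the included declarations where the xsd:include used to be.
        for offset, child in enumerate(list(included_root)):
            root.insert(position + offset, child)
    return root
```

With lxml, the returned element can then be passed to etree.XMLSchema in the same way as schema_document in the question, and it can be walked like any other element tree.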

What I'm asking for is an idea of how to solve this problem in a more standard way. I've already searched for other schema parsers in Python, but so far nothing fits this case.

Greetings,

PyXB can process xsd:include. I used PyXB for Amazon.com's huge product schema files, where included files include further xsd files at multiple levels. Highly recommended.
