No DTD validation and XInclude resolution when using Saxon C HE with Python
I have a question about the Saxon C HE version for Python. After a successful installation I tried some examples that run XSLT transformations, and these all worked.
However, when I parse an XML file, no DTD validation is performed during parsing and the XIncludes are not resolved. I have tried many things, but I have not been able to solve this problem. I hope someone can point out and explain my error.
Below is an example that should deliberately trigger an error when DTD validation is performed, because the DTD declares no element named FOU. When I run the script, it creates a Result.xml file in which the erroneous FOU element is still present and the XInclude is left unresolved.
I am aware that this is easy to do with lxml, but I would like to know how it works with the Saxon parser.
XML Master:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE TEST SYSTEM "Test.dtd">
<TEST>
<FOU Id="A-1">
<BAR Name="Test-Bar-1"/>
<BAR Name="Test-Bar-2"/>
<BAR Name="Test-Bar-3"/>
</FOU>
<TUTU Id="TU-1">
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="Include.xml" xpointer="xpointer(/node()/node()/*)"/>
</TUTU>
</TEST>
XML Include:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE TEST SYSTEM "Test.dtd">
<TEST>
<TUTU Id="TU-1">
<TITI Name="Titi-1"/>
<TITI Name="Titi-2"/>
<TITI Name="Titi-3"/>
</TUTU>
</TEST>
DTD:
<!ELEMENT TEST (FOO+ , TUTU+)>
<!ELEMENT FOO (BAR+)>
<!ELEMENT BAR ANY>
<!ELEMENT TUTU (TITI+)>
<!ELEMENT TITI ANY>
<!-- Attribute -->
<!ATTLIST TEST
>
<!ATTLIST FOO
Id ID #REQUIRED
>
<!ATTLIST BAR
Name CDATA #IMPLIED
>
<!ATTLIST TUTU
Id ID #REQUIRED
>
<!ATTLIST TITI
Name CDATA #IMPLIED
>
Python Script:
import saxonc

with saxonc.PySaxonProcessor(license=False) as proc:
    print(proc.version)
    xdmAtomicval = proc.make_boolean_value(False)
    xsltproc = proc.new_xslt_processor()
    document = proc.parse_xml(xml_file_name='Master.xml')
    print(document)
    xsltproc.set_source(xdm_node=document)
    xsltproc.set_output_file("Result.xml")
    xsltproc.compile_stylesheet(stylesheet_file="styl.xslt")
    xsltproc.transform_to_file(stylesheet_file="styl.xslt")
    documentRes = proc.parse_xml(xml_file_name='Result.xml')
    print(documentRes)
You should be able to set the xi and dtd configuration properties to "on".
proc.set_configuration_property("xi", "on")
proc.set_configuration_property("dtd", "on")
However, the only way I could get it to work was to remove the xpointer attribute from the xi:include element. I didn't have time to research why it doesn't work with the xpointer in place.
It also doesn't appear that parse_xml() does any validation or XInclude resolution, but both did happen during the transform (set the dtd property to "off" or "recover" to get Result.xml).
Here's the modified version of your Python that I used to test:
import os
import saxonc

with saxonc.PySaxonProcessor(license=False) as proc:
    print(proc.version)
    proc.set_cwd(os.getcwd())
    proc.set_configuration_property("xi", "on")
    proc.set_configuration_property("dtd", "on")
    document = proc.parse_xml(xml_file_name='Master.xml')
    print(document)
    xsltproc = proc.new_xslt30_processor()
    xsltproc.transform_to_file(source_file="Master.xml", stylesheet_file="styl.xslt", output_file="Result.xml")
    documentRes = proc.parse_xml(xml_file_name='Result.xml')
    print(documentRes)
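To confirm whether the transform really resolved the XInclude and rejected the FOU element, it can help to inspect the output independently of Saxon. The following is only a rough sketch using Python's standard library; the helper name find_leftovers and the inline sample document are made up for illustration, and it assumes styl.xslt (not shown in the question) copies the input structure through more or less unchanged.

```python
import xml.etree.ElementTree as ET

# Namespace used by xi:include elements in the question's Master.xml.
XI_NS = "http://www.w3.org/2001/XInclude"


def find_leftovers(xml_text, invalid_names=("FOU",)):
    """Count unresolved xi:include elements and elements the DTD does not declare.

    Hypothetical helper: returns (unresolved_xincludes, invalid_elements).
    """
    root = ET.fromstring(xml_text)
    unresolved = sum(1 for _ in root.iter(f"{{{XI_NS}}}include"))
    invalid = sum(1 for el in root.iter() if el.tag in invalid_names)
    return unresolved, invalid


# Stand-in for an unprocessed Result.xml (illustrative content only).
sample = """<TEST>
  <FOU Id="A-1"><BAR Name="Test-Bar-1"/></FOU>
  <TUTU Id="TU-1">
    <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="Include.xml"/>
  </TUTU>
</TEST>"""

print(find_leftovers(sample))
```

For the real file, reading Result.xml in binary mode and passing the bytes to find_leftovers should work the same way. A result of (0, 0) would suggest that the XInclude was expanded and that no FOU element made it into the output.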
The PyDocumentBuilder class, which is new in SaxonC 11, should enable you to do DTD validation. See: https://www.saxonica.com/saxon-c/doc11/html/saxonc.html#PyDocumentBuilder You should be able to use the method dtd_validation to set validation.
You can create a PyDocumentBuilder as follows:
proc.new_document_builder