
lxml XSLT removes CDATA while processing XML

Handling CDATA with lxml involves creating a parser with the appropriate option, but what about XSLT? For example:

from lxml import etree

parser = etree.XMLParser(strip_cdata=False)
tree = etree.parse('sample_with_cdata.xml', parser)
transform = etree.XSLT(etree.parse('dupe.xsl'))
xml_out = transform(tree)
xml_out.write('processed.xml')

If I process an XML file containing CDATA through the lxml XSLT processor, all CDATA sections are stripped. How can I tell the XSLT processor to leave CDATA as is?

P.S. FYI, adding the same parser to etree.XSLT doesn't change the outcome.

This doesn't seem to be related to lxml. It was my lack of knowledge...

CDATA in XSLT should be handled with the "cdata-section-elements" attribute of the output declaration. For example, if the description element in the XML file contains CDATA:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" cdata-section-elements='description' />
...
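To tie this back to the lxml code in the question, here is a minimal sketch, not a definitive implementation. The inline input document, the identity template, and the variable names are assumptions made for illustration; only the cdata-section-elements attribute comes from the answer above.

```python
from lxml import etree

# Hypothetical input: the <item> wrapper and the text content are assumptions.
source = etree.fromstring(
    "<item><description><![CDATA[a < b & c]]></description></item>",
    etree.XMLParser(strip_cdata=False),
)

# Identity transform (an assumption; any stylesheet would do) whose
# xsl:output asks for the text of <description> to be written as CDATA.
stylesheet = etree.XML("""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" cdata-section-elements="description"/>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
""")

transform = etree.XSLT(stylesheet)
result = transform(source)

# str() on an XSLT result serializes it according to the xsl:output settings.
serialized = str(result)
print(serialized)
```

The element names listed in cdata-section-elements are whitespace-separated, so several elements can be named in one attribute if more than one needs CDATA output.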

As far as XSLT is concerned, CDATA sections in XML are just noise. XSLT treats <![CDATA["]]> the same as &quot;, which it treats the same as "; they are different ways for the document author to write the same thing.
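A small sketch of that equivalence, using lxml's XPath support as a stand-in for what an XSLT stylesheet sees. The element name v and the variable names are assumptions for illustration only.

```python
from lxml import etree

# Three spellings of the same character data: a CDATA section,
# an entity reference, and a literal quote character.
variants = [
    '<v><![CDATA["]]></v>',
    '<v>&quot;</v>',
    '<v>"</v>',
]

# Keep CDATA nodes in the parsed tree, as the parser in the question does.
parser = etree.XMLParser(strip_cdata=False)

# The XPath string value is what XSLT expressions operate on.
values = [etree.fromstring(xml, parser).xpath("string(/v)") for xml in variants]
print(values)
```

All three documents yield the same string value, which is why a stylesheet cannot tell from the data alone which spelling the author used.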

If you are using CDATA sections in your input to convey information, that is, if <![CDATA[xxx]]> means something different from xxx, then you need to change your XML design.
