Preserving CDATA with lxml means creating a parser with the right option (strip_cdata=False), but what about XSLT? For example:
from lxml import etree
parser = etree.XMLParser(strip_cdata=False)
tree = etree.parse('sample_with_cdata.xml', parser)
transform = etree.XSLT(etree.parse('dupe.xsl'))
xml_out = transform(tree)
xml_out.write('processed.xml')
If I process an XML file containing CDATA through the lxml XSLT processor, all CDATA sections are stripped. How can I tell the XSLT processor to leave CDATA as is?
PS. FYI, adding the same parser to etree.XSLT doesn't change the outcome.
This doesn't seem to be related to lxml. It's my lack of knowledge...
CDATA in XSLT should be handled with the cdata-section-elements attribute of the xsl:output declaration. For example, if the description element in the XML file contains CDATA:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" cdata-section-elements='description' />
...
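To tie this back to the lxml code in the question, here is a minimal sketch of the whole round trip. The sample document and the identity-style stylesheet are made up for illustration (the real dupe.xsl is not shown), and the result is serialized with str(), which goes through the XSLT serializer and so should honor xsl:output:

```python
from lxml import etree

# Hypothetical input: a <description> element whose text is a CDATA section.
SOURCE = b"<item><description><![CDATA[a < b & c]]></description></item>"

# Hypothetical identity-style stylesheet. cdata-section-elements names the
# output elements whose text should be written as CDATA sections.
STYLESHEET = b"""\
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" encoding="UTF-8" cdata-section-elements="description"/>
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
</xsl:stylesheet>
"""

transform = etree.XSLT(etree.fromstring(STYLESHEET))
result = transform(etree.fromstring(SOURCE))

# str() serializes the result according to the stylesheet's xsl:output.
output = str(result)
print(output)
```

Note that the input is parsed here with the default parser, so the CDATA wrapper in the output comes from the stylesheet declaration, not from strip_cdata=False.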
As far as XSLT is concerned, CDATA sections in XML are just noise. XSLT treats <![CDATA["]]> the same as ", which it treats the same as &quot;; they are different ways for the document author to write the same thing.
If you are using CDATA sections in your input to convey information, that is, if <![CDATA[xxx]]> means something different from xxx, then you need to change your XML design.
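A quick way to check this equivalence from Python is to compare the XPath string value of the three spellings. This is only an illustrative sketch; the element name a is arbitrary:

```python
from lxml import etree

# Three ways an author might write a single double-quote character.
variants = [
    '<a><![CDATA["]]></a>',  # CDATA section
    '<a>"</a>',              # literal character
    '<a>&quot;</a>',         # predefined entity reference
]

# XPath (and therefore XSLT) works on the string value, which is the
# same for all three spellings.
values = [etree.fromstring(v).xpath("string(.)") for v in variants]
print(values)
```

All three values compare equal, which is why a stylesheet has no way to tell which form the input used.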