Parsing CDATA in XML with Python
I need to parse an XML file that contains several CDATA blocks, and I need to keep the content of those blocks so I can plot it later:
<process id="process1">
<log name="name1" device="device1"><![CDATA[timestamp value]]></log>
<log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
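I have to do this repeatedly and quickly, so I am looking for the best way to go about it. I have read that ElementTree is one of the faster options, but I am open to other suggestions.
Here are two examples of how to do it: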
from lxml import etree
import xml.etree.ElementTree as ElementTree

CONTENT = """
<process id="process1">
<log name="name1" device="device1"><![CDATA[timestamp value]]></log>
<log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""


def parse_with_lxml():
    # lxml merges CDATA content into the element's text by default.
    root = etree.fromstring(CONTENT)
    for log in root.xpath("//log"):
        print(log.text)


def parse_with_stdlib():
    # The standard-library parser exposes CDATA content the same way.
    root = ElementTree.fromstring(CONTENT)
    for log in root.iter('log'):
        print(log.text)


if __name__ == '__main__':
    parse_with_lxml()
    parse_with_stdlib()
Output:
timestamp value
timestamp value, timestamp value, timestamp
timestamp value
timestamp value, timestamp value, timestamp
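Both parsers take care of the CDATA for you: in each case the content is available through the element's text attribute.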
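Since the goal is to keep the blocks around for plotting later, it may help to collect them instead of printing them. The following is only a minimal sketch using the standard library; the helper name `collect_logs` is made up for illustration, and it assumes each `log` element has a unique `name` attribute, as in the sample above.

```python
import xml.etree.ElementTree as ElementTree

CONTENT = """
<process id="process1">
<log name="name1" device="device1"><![CDATA[timestamp value]]></log>
<log name="name2" device="device2"><![CDATA[timestamp value, timestamp value, timestamp]]></log>
</process>
"""


def collect_logs(xml_text):
    """Hypothetical helper: map each log's name to its CDATA text."""
    root = ElementTree.fromstring(xml_text)
    # log.text already holds the CDATA content; "or ''" guards against
    # empty <log> elements, whose text is None.
    return {log.get("name"): (log.text or "") for log in root.iter("log")}


if __name__ == "__main__":
    for name, text in collect_logs(CONTENT).items():
        print(name, "->", text)
```

The returned dictionary keeps the raw text of each block, so splitting it into timestamp/value pairs can be deferred until plotting time.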