Python lxml iterparse fails on large files containing namespaces
I'm trying to parse a large file (>100 MB) as described at http://effbot.org/zone/element-iterparse.htm#incremental-parsing
But if the file contains namespaces, lxml fails with the error:
lxml.etree.XMLSyntaxError: Namespace default prefix was not found
It works fine if I remove elem.clear(), but then it uses a lot of memory.
Example of the XML file:
<?xml version="1.0" encoding="utf-8" ?>
<feed xmlns="NS">
<offer>
<type>type1</type>
<name>name1</name>
</offer>
</feed>
The lxml version is 3.2.0, because newer versions segfault after parsing finishes.
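For reference, the incremental-parsing pattern in question can be sketched roughly as below. This is not the original failing code (which is not shown here) but a minimal illustration under some assumptions: it uses the standard library's xml.etree.ElementTree, which is what the linked effbot article describes, rather than lxml; it listens only for "end" events; and it clears only fully parsed offer elements. The helper name iter_offers and the constant OFFER_TAG are made up for the example. Whether restricting elem.clear() in this way avoids the lxml error is not confirmed.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URI taken from the sample document above.
NS = "NS"
# ElementTree exposes namespaced tags in "{uri}local" form.
OFFER_TAG = "{%s}offer" % NS


def iter_offers(source):
    """Yield (type, name) for each <offer>, then free that element.

    Hypothetical helper for illustration only.
    """
    # "end" events fire once an element and all its children are parsed,
    # so the element is complete when it is read and cleared.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag != OFFER_TAG:
            continue  # leave <feed>, <type>, <name> untouched
        offer_type = elem.findtext("{%s}type" % NS)
        offer_name = elem.findtext("{%s}name" % NS)
        yield offer_type, offer_name
        # Drop the children and text of the processed <offer> to save memory.
        elem.clear()


sample = b"""<?xml version="1.0" encoding="utf-8" ?>
<feed xmlns="NS">
<offer>
<type>type1</type>
<name>name1</name>
</offer>
</feed>"""

offers = list(iter_offers(io.BytesIO(sample)))
print(offers)
```

For a real file, a path or an open binary file object would be passed in place of the io.BytesIO wrapper. lxml.etree.iterparse accepts the same events argument, so the loop should carry over with only the import changed.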