[英]What's the best way to handle XML in Python?
I am working on an XML file that is very large (I think that since about it is 45 GB) I have to search through the document of OpenStreetMap data and find a particular path that satisfies a certain criteria. 我正在处理一个非常大的XML文件(我认为既然是45 GB),我必须搜索OpenStreetMap数据的文档并找到满足特定条件的特定路径。 I am currently using XML element tree however I think it is a little slow since I need to search the XML for specific Geographic coordinates. 我目前正在使用XML元素树,但是我认为这有点慢,因为我需要在XML中搜索特定的地理坐标。 Is there a better way to do this ? 有一个更好的方法吗 ? I have also seen a little bit of lxml however, I'd like to know if there's a better choice between the two ? 我也看过一点lxml,但是我想知道两者之间是否有更好的选择? Thank you! 谢谢!
The sax package is well suited for parsing huge xml
files that can't be loaded into memory all at once. sax软件包非常适合解析无法一次全部加载到内存的巨大xml
文件。 The parser will go "line-by-line" and notify you when it encountered the beginning or ending of an element, instead of loading the file into memory, parsing it, and giving you the whole tree. 解析器将“逐行”发送,并在遇到元素的开头或结尾时通知您,而不是将文件加载到内存中,进行解析并为您提供整棵树。
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.