简体   繁体   English

在Python中处理XML的最佳方法是什么?

[英]What's the best way to handle XML in Python?

I am working on an XML file that is very large (I think that since about it is 45 GB) I have to search through the document of OpenStreetMap data and find a particular path that satisfies a certain criteria. 我正在处理一个非常大的XML文件(我认为既然是45 GB),我必须搜索OpenStreetMap数据的文档并找到满足特定条件的特定路径。 I am currently using XML element tree however I think it is a little slow since I need to search the XML for specific Geographic coordinates. 我目前正在使用XML元素树,但是我认为这有点慢,因为我需要在XML中搜索特定的地理坐标。 Is there a better way to do this ? 有一个更好的方法吗 ? I have also seen a little bit of lxml however, I'd like to know if there's a better choice between the two ? 我也看过一点lxml,但是我想知道两者之间是否有更好的选择? Thank you! 谢谢!

The sax package is well suited for parsing huge xml files that can't be loaded into memory all at once. sax软件包非常适合解析无法一次全部加载到内存的巨大xml文件。 The parser will go "line-by-line" and notify you when it encountered the beginning or ending of an element, instead of loading the file into memory, parsing it, and giving you the whole tree. 解析器将“逐行”发送,并在遇到元素的开头或结尾时通知您,而不是将文件加载到内存中,进行解析并为您提供整棵树。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM