Python XML File Open

I am trying to open an XML file and parse it, but the file never seems to open; the script just keeps running. Any ideas?

from xml.dom import minidom
Test_file = open('C::/test_file.xml','r')
xmldoc = minidom.parse(Test_file)

Test_file.close()

for i in xmldoc:
     print('test')

The file is 180.288 KB. Why does it never reach the print statement?

Running your Python code with a few adjustments:

from xml.dom import minidom
Test_file = open('C:/test_file.xml','r')
xmldoc = minidom.parse(Test_file)

Test_file.close()

def printNode(node):
    # Print this node, then recurse into its children.
    print(node)
    for child in node.childNodes:
        printNode(child)

printNode(xmldoc.documentElement)

With this sample input as test_file.xml:

<a>
  <b>testing 1</b>
  <c>testing 2</c>
</a>

Yields this output (shown as printed by Python 2; Python 3 omits the u prefix on the strings):

<DOM Element: a at 0xbc56e8>
<DOM Text node "u'\n  '">
<DOM Element: b at 0xbc5788>
<DOM Text node "u'testing 1'">
<DOM Text node "u'\n  '">
<DOM Element: c at 0xbc5828>
<DOM Text node "u'testing 2'">
<DOM Text node "u'\n'">
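The whitespace-only text nodes in that output come from the indentation and line breaks in the sample input. A minimal Python 3 sketch of one way to skip them while walking the tree follows. The `describe` helper and the `SAMPLE` string are made up for illustration and are not part of the original answer; `SAMPLE` mirrors the test_file.xml content so that the example needs no file on disk.

```python
from xml.dom import minidom

# Same content as the sample test_file.xml, inlined so no file is needed.
SAMPLE = "<a>\n  <b>testing 1</b>\n  <c>testing 2</c>\n</a>"


def describe(node, depth=0, lines=None):
    """Hypothetical helper: collect one line per element or non-blank text node."""
    if lines is None:
        lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + "<" + node.tagName + ">")
    elif node.nodeType == node.TEXT_NODE:
        if not node.data.strip():
            # Whitespace-only text node (indentation or newline): skip it.
            return lines
        lines.append("  " * depth + repr(node.data))
    for child in node.childNodes:
        describe(child, depth + 1, lines)
    return lines


xmldoc = minidom.parseString(SAMPLE)
outline = describe(xmldoc.documentElement)
print("\n".join(outline))
```

Checking `nodeType` rather than the class keeps the walk independent of minidom's internal class names.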

Notes:

  • As @LukeWoodward mentioned, avoid DOM-based libraries for large inputs, although 180K should be fine. For 180M, minidom.parse() may run out of memory (MemoryError) before control ever returns.
  • As @alecxe mentioned, you should eliminate the extraneous ':' in the file spec. You should have seen error output along the lines of IOError: [Errno 22] invalid mode ('r') or filename: 'C::/test_file.xml'.
  • As @mzjn mentioned, xml.dom.minidom.Document is not iterable. You should have seen error output along the lines of TypeError: iteration over non-sequence.
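
To make the first and third notes concrete, here is a hedged Python 3 sketch. It shows that looping over a Document raises TypeError, whereas `childNodes` and `getElementsByTagName` can be iterated. It also shows one possible streaming alternative for large inputs, `xml.etree.ElementTree.iterparse`, which is a swapped-in standard-library technique and not something the answer itself used. `SAMPLE` and `stream_text` are illustrative names, not part of the original post.

```python
import io
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Same content as the sample test_file.xml, inlined so no file is needed.
SAMPLE = "<a>\n  <b>testing 1</b>\n  <c>testing 2</c>\n</a>"

xmldoc = minidom.parseString(SAMPLE)

# A Document cannot be looped over directly; Python raises TypeError.
try:
    for _ in xmldoc:
        pass
    document_is_iterable = True
except TypeError:
    document_is_iterable = False

# Its child nodes, and the results of a tag lookup, can be iterated.
top_level = [node.nodeName for node in xmldoc.childNodes]
b_texts = [el.firstChild.data for el in xmldoc.getElementsByTagName("b")]


def stream_text(source, tag):
    """Hypothetical helper: yield the text of each <tag> element as it is parsed."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            yield elem.text
            # Drop the element's contents once it has been handled.
            elem.clear()


streamed = list(stream_text(io.BytesIO(SAMPLE.encode("utf-8")), "c"))
print(document_is_iterable, top_level, b_texts, streamed)
```

For a real file, a path such as 'C:/test_file.xml' could be passed to `stream_text` in place of the in-memory buffer, since `iterparse` accepts either a filename or a file object.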
