简体   繁体   English

Python 漂亮地打印给定 XML 字符串的 XML

[英]Python pretty print an XML given an XML string

I generated a long and ugly XML string with Python and I need to filter it through pretty printer to look nicer.我用 Python 生成了一个又长又丑的 XML 字符串,我需要通过漂亮的打印机过滤它以看起来更好。

I found this post for python pretty printers, but I have to write the XML string to a file to be read back to use the tools, which I want to avoid if possible.我找到了这篇关于 python 漂亮打印机的帖子,但我必须将 XML 字符串写入一个文件以供读回以使用这些工具,如果可能的话,我想避免这种情况。

What python pretty tools are available that work on strings?有哪些 Python 漂亮的工具可以处理字符串?

Here's how to parse from a text string to the lxml structured data type.以下是如何从文本字符串解析为 lxml 结构化数据类型。

Python 2:蟒蛇2:

from lxml import etree
xml_str = "<parent><child>text</child><child>other text</child></parent>"
root = etree.fromstring(xml_str)
print etree.tostring(root, pretty_print=True)

Python 3:蟒蛇3:

from lxml import etree
xml_str = "<parent><child>text</child><child>other text</child></parent>"
root = etree.fromstring(xml_str)
print(etree.tostring(root, pretty_print=True).decode())

Outputs:输出:

<parent>
  <child>text</child>
  <child>other text</child>
</parent>

I use the lxml library, and there it's as simple as我使用 lxml 库,它就像

>>> print(etree.tostring(root, pretty_print=True))

You can do that operation using any etree , which you can either generate programmatically, or read from a file.您可以使用任何etree执行该操作,您可以以编程方式生成或从文件中读取。

If you're using the DOM from PyXML, it's如果你使用 PyXML 中的 DOM,它是

import xml.dom.ext
xml.dom.ext.PrettyPrint(doc)

That prints to the standard output, unless you specify an alternate stream.除非您指定备用流,否则会打印到标准输出。

http://pyxml.sourceforge.net/topics/howto/node19.html http://pyxml.sourceforge.net/topics/howto/node19.html

To directly use the minidom, you want to use the toprettyxml() function.要直接使用 minidom,您需要使用toprettyxml()函数。

http://docs.python.org/library/xml.dom.minidom.html#xml.dom.minidom.Node.toprettyxml http://docs.python.org/library/xml.dom.minidom.html#xml.dom.minidom.Node.toprettyxml

Here's a Python3 solution that gets rid of the ugly newline issue (tons of whitespace), and it only uses standard libraries unlike most other implementations.这是一个 Python3 解决方案,它摆脱了丑陋的换行符问题(大量空格),并且与大多数其他实现不同,它只使用标准库。 You mention that you have an xml string already so I am going to assume you used xml.dom.minidom.parseString()你提到你已经有一个 xml 字符串,所以我假设你使用了xml.dom.minidom.parseString()

With the following solution you can avoid writing to a file first:使用以下解决方案,您可以避免先写入文件:

import xml.dom.minidom
import os

def pretty_print_xml_given_string(input_string, output_xml):
    """
    Useful for when you are editing xml data on the fly
    """
    xml_string = input_string.toprettyxml()
    xml_string = os.linesep.join([s for s in xml_string.splitlines() if s.strip()]) # remove the weird newline issue
    with open(output_xml, "w") as file_out:
        file_out.write(xml_string)

I found how to fix the common newline issue here .我在这里找到了如何解决常见的换行问题。

声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM