How to write XML declaration using xml.etree.ElementTree
I am generating an XML document in Python using an ElementTree, but the tostring function doesn't include an XML declaration when converting to plaintext.
from xml.etree.ElementTree import Element, SubElement, tostring
document = Element('outer')
node = SubElement(document, 'inner')
node.NewValue = 1
print(tostring(document))  # Outputs <outer><inner /></outer> (shown as a bytes literal in Python 3)
I need my string to include the following XML declaration:
<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
However, there does not seem to be any documented way of doing this.
Is there a proper method for rendering the XML declaration in an ElementTree?
I am surprised to find that there doesn't seem to be a way with ElementTree.tostring(). You can, however, use ElementTree.ElementTree.write() to write your XML document to a fake file:
from io import BytesIO
from xml.etree import ElementTree as ET
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)
f = BytesIO()
et.write(f, encoding='utf-8', xml_declaration=True)
print(f.getvalue()) # your XML file, encoded as UTF-8
See this question. Even then, I don't think you can get your 'standalone' attribute without prepending it yourself.
I would use lxml (see http://lxml.de/api.html).
Then you can:
from lxml import etree
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, xml_declaration=True))
If you include encoding='utf8', you will get an XML header:
xml.etree.ElementTree.tostring writes an XML encoding declaration with encoding='utf8'
Sample Python code (works with Python 2 and 3):
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(
    ElementTree.fromstring('<xml><test>123</test></xml>')
)
root = tree.getroot()
print('without:')
print(ElementTree.tostring(root, method='xml'))
print('')
print('with:')
print(ElementTree.tostring(root, encoding='utf8', method='xml'))
Python 2 output:
$ python2 example.py
without:
<xml><test>123</test></xml>
with:
<?xml version='1.0' encoding='utf8'?>
<xml><test>123</test></xml>
With Python 3 you will note the b prefix indicating byte literals are returned (just like with Python 2):
$ python3 example.py
without:
b'<xml><test>123</test></xml>'
with:
b"<?xml version='1.0' encoding='utf8'?>\n<xml><test>123</test></xml>"
Is there a proper method for rendering the XML declaration in an ElementTree?
Yes, and there is no need to use the .tostring function. According to the ElementTree documentation, you should create an ElementTree object, create the Element and SubElements, set the tree's root, and finally pass the xml_declaration argument to the .write function, so that the declaration line is included in the output file.
You can do it this way:
import xml.etree.ElementTree as ET
document = ET.Element("outer")
node1 = ET.SubElement(document, "inner")
node1.text = "text"
# Set the root through the constructor rather than a placeholder string plus the private _setroot()
tree = ET.ElementTree(document)
tree.write("./output.xml", encoding="UTF-8", xml_declaration=True)
And the output file is:
<?xml version='1.0' encoding='UTF-8'?>
<outer><inner>text</inner></outer>
I encountered this issue recently. After some digging in the code, I found that the following snippet is the definition of the function ElementTree.write:
def write(self, file, encoding="us-ascii"):
    assert self._root is not None
    if not hasattr(file, "write"):
        file = open(file, "wb")
    if not encoding:
        encoding = "us-ascii"
    elif encoding != "utf-8" and encoding != "us-ascii":
        file.write("<?xml version='1.0' encoding='%s'?>\n" %
                   encoding)
    self._write(file, self._root, encoding, {})
So the answer is: if you need to write the XML header to your file, set the encoding argument to something other than utf-8 or us-ascii, e.g. UTF-8. The comparison in this snippet is case-sensitive, so the uppercase spelling does not match either string and the header is written.
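Note that the snippet above appears to come from an older ElementTree version. As far as I can tell, current Python 3 versions lowercase the encoding before comparing, so 'UTF-8' no longer triggers the header there, while a spelling such as 'utf8' (which is not literally 'utf-8' or 'us-ascii') still does. A minimal sketch to check this on your own interpreter; has_declaration is a hypothetical helper name, not part of the library:

```python
import xml.etree.ElementTree as ET

root = ET.Element('outer')
ET.SubElement(root, 'inner')

def has_declaration(encoding):
    # Hypothetical helper: True if tostring() emitted an XML declaration
    # for the given encoding spelling.
    return ET.tostring(root, encoding=encoding).startswith(b'<?xml')

# Print which spellings produce a declaration on this interpreter.
for name in ('us-ascii', 'utf-8', 'UTF-8', 'utf8'):
    print(name, has_declaration(name))
```

If the implicit behavior matters to you, passing xml_declaration=True explicitly (to write(), or to tostring() on Python 3.8+) avoids depending on the spelling of the encoding.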
A minimal working example using the ElementTree package:
import xml.etree.ElementTree as ET
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
node.text = '1'
res = ET.tostring(document, encoding='utf8', method='xml').decode()
print(res)
The output is:
<?xml version='1.0' encoding='utf8'?>
<outer><inner>1</inner></outer>
Another pretty simple option is to concatenate the desired header to the XML string like this:
xml = (bytes('<?xml version="1.0" encoding="UTF-8"?>\n', encoding='utf-8') + ET.tostring(root))
xml = xml.decode('utf-8')
with open('invoice.xml', 'w+') as f:
    f.write(xml)
Easy.
Sample for both Python 2 and 3 (the encoding parameter must be utf8):
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='utf8', method='xml'))
From Python 3.8 there is an xml_declaration parameter for this:
New in version 3.8: The xml_declaration and default_namespace parameters.
xml.etree.ElementTree.tostring(element, encoding="us-ascii", method="xml", *, xml_declaration=None, default_namespace=None, short_empty_elements=True)
Generates a string representation of an XML element, including all subelements. element is an Element instance. encoding is the output encoding (default is US-ASCII). Use encoding="unicode" to generate a Unicode string (otherwise, a bytestring is generated). method is either "xml", "html" or "text" (default is "xml"). xml_declaration, default_namespace and short_empty_elements has the same meaning as in ElementTree.write(). Returns an (optionally) encoded string containing the XML data.
Sample for Python 3.8 and higher:
import xml.etree.ElementTree as ElementTree
tree = ElementTree.ElementTree(ElementTree.fromstring('<xml><test>123</test></xml>'))
root = tree.getroot()
print(ElementTree.tostring(root, encoding='unicode', method='xml', xml_declaration=True))
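If you want text with a predictable declaration, one option is to serialize with an explicit byte encoding and decode afterwards. This is only a sketch, assuming Python 3.8 or later. My understanding is that with encoding='unicode' the encoding named in the declaration can differ between Python versions and platforms, so I would not rely on it:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring('<xml><test>123</test></xml>')

# Serialize to bytes with an explicit encoding so the declaration names it,
# then decode with the same encoding to get a str.
xml_bytes = ET.tostring(root, encoding='UTF-8', xml_declaration=True)
xml_text = xml_bytes.decode('UTF-8')
print(xml_text)
```</xml>"
assert isinstance(xml_bytes, bytes)
</test>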
try:
    from lxml import etree
    print("running with lxml.etree")
except ImportError:
    try:
        # Python 2.5
        import xml.etree.cElementTree as etree
        print("running with cElementTree on Python 2.5+")
    except ImportError:
        try:
            # Python 2.5
            import xml.etree.ElementTree as etree
            print("running with ElementTree on Python 2.5+")
        except ImportError:
            try:
                # normal cElementTree install
                import cElementTree as etree
                print("running with cElementTree")
            except ImportError:
                try:
                    # normal ElementTree install
                    import elementtree.ElementTree as etree
                    print("running with ElementTree")
                except ImportError:
                    print("Failed to import ElementTree from any known place")
document = etree.Element('outer')
node = etree.SubElement(document, 'inner')
print(etree.tostring(document, encoding='UTF-8', xml_declaration=True))
This works if you just want to print. I get an error when I try to send it to a file...
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET
from xml.etree.ElementTree import Element, SubElement, Comment, tostring
def prettify(elem):
    rough_string = ET.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent=" ")
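The answer does not say which error occurred when writing to a file, so the following is only a guess at one workable route. A common pitfall is mixing text and bytes, so this sketch asks minidom for encoded bytes (which also puts the encoding into the declaration) and opens the file in binary mode. The name prettify_bytes and the temporary output path are assumptions for illustration:

```python
import os
import tempfile
import xml.dom.minidom as minidom
import xml.etree.ElementTree as ET

def prettify_bytes(elem, encoding='utf-8'):
    # Hypothetical variant of prettify() that returns encoded bytes.
    # Passing encoding makes minidom name it in the XML declaration.
    rough = ET.tostring(elem, encoding)
    return minidom.parseString(rough).toprettyxml(indent='  ', encoding=encoding)

root = ET.Element('outer')
ET.SubElement(root, 'inner').text = 'text'

# Assumed output location; replace with your own path.
path = os.path.join(tempfile.mkdtemp(), 'output.xml')
with open(path, 'wb') as f:  # binary mode, because the data is bytes
    f.write(prettify_bytes(root))
```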
I didn't find any alternative for adding the standalone argument in the documentation, so I adapted the ET.tostring function to take it as an argument.
from xml.etree import ElementTree as ET
# Sample
document = ET.Element('outer')
node = ET.SubElement(document, 'inner')
et = ET.ElementTree(document)
# Function that you need
def tostring(element, declaration, encoding=None, method=None):
    class dummy:
        pass
    data = []
    data.append(declaration+"\n")
    file = dummy()
    file.write = data.append
    ET.ElementTree(element).write(file, encoding, method=method)
    return "".join(data)
# Working example
xdec = """<?xml version="1.0" encoding="UTF-8" standalone="no" ?>"""
xml = tostring(document, encoding='utf-8', declaration=xdec)
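The function above joins the declaration (a str) with whatever write() hands to data.append, which I believe is bytes on Python 3 when a byte encoding is given, so the final join may fail there. A minimal Python 3 sketch of the same idea serializes the body as text, prepends a hand-written declaration, and encodes the result once. The helper name tostring_with_declaration and its parameters are hypothetical:

```python
import xml.etree.ElementTree as ET

def tostring_with_declaration(element, standalone='yes', encoding='UTF-8'):
    # Hypothetical helper: build the declaration by hand, because
    # ElementTree offers no standalone option.
    declaration = '<?xml version="1.0" encoding="%s" standalone="%s" ?>\n' % (
        encoding, standalone)
    # encoding='unicode' returns a str without any declaration of its own.
    body = ET.tostring(element, encoding='unicode')
    return (declaration + body).encode(encoding)

document = ET.Element('outer')
ET.SubElement(document, 'inner')
xml = tostring_with_declaration(document)
print(xml.decode('UTF-8'))
```

Encoding the whole string at the end keeps the bytes consistent with the encoding named in the declaration.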