XML Parsing with Decode in Python
The XML file I am trying to read starts with b':
b'<?xml version="1.0" encoding="UTF-8" ?><root><property_id type="dict"><n53987 type="int">54522</n53987><n65731 type="int">66266</n65731><n44322 type="int">44857</n44322><n11633 type="int">12148</n11633><n28192 type="int">28727</n28192><n69053 type="int">69588</n69053><n26529 type="int">27064</n26529><n4844 type="int">4865</n4844><n7625 type="int">7646</n7625><n54697 type="int">55232</n54697><n6210 type="int">6231</n6210><n26710 type="int">27245</n26710><n57915 type="int">58450</n57915
import xml.etree.ElementTree as etree
tree = etree.decode("UTF-8").parse("./property.xml")
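How can I decode this file, and then read the dict type afterwards?
You can try the following, but note that it returns an Element instance rather than an ElementTree: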
import ast
import xml.etree.ElementTree as etree
tree = None
with open("property.xml", "r") as xml_file:
    f = xml_file.read()
# convert string representation of bytes back to bytes
raw_xml_bytes = ast.literal_eval(f)
# read XML from raw bytes
tree = etree.fromstring(raw_xml_bytes)
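For the second part of the question (reading the dict type), here is a minimal sketch. It assumes the full file follows the pattern visible in the truncated sample above: a property_id element marked type="dict" whose children are named n<key> and marked type="int". The shortened sample string and the helper name element_to_dict are hypothetical, made up for illustration, and treating the part of the tag after "n" as an integer key is also an assumption.

```python
import ast
import xml.etree.ElementTree as etree

# Hypothetical, shortened stand-in for the file contents: the text of a
# Python bytes literal, as in the question.
file_text = (
    "b'<?xml version=\"1.0\" encoding=\"UTF-8\" ?><root>"
    "<property_id type=\"dict\">"
    "<n53987 type=\"int\">54522</n53987>"
    "<n65731 type=\"int\">66266</n65731>"
    "</property_id></root>'"
)


def element_to_dict(element):
    """Turn <nKEY type="int">VALUE</nKEY> children into {KEY: VALUE}."""
    result = {}
    for child in element:
        # Assumption: tags look like "n53987", so drop the leading "n".
        key = int(child.tag[1:])
        # Convert the text to int only when the element says type="int".
        value = int(child.text) if child.get("type") == "int" else child.text
        result[key] = value
    return result


raw_xml_bytes = ast.literal_eval(file_text)  # str -> bytes
root = etree.fromstring(raw_xml_bytes)       # bytes -> Element
property_id = element_to_dict(root.find("property_id"))
print(property_id)
```

Because etree.fromstring returns the root Element directly, root.find("property_id") can be called on it without going through getroot().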
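Another approach is to read the file, write its contents back out as a plain XML text file, and then read that file again, which returns an ElementTree instance. You can do this as follows: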
import ast
import xml.etree.ElementTree as etree
tree = None
with open("property.xml", "r") as xml_file:
    f = xml_file.read()
# convert string representation of bytes back to bytes
raw_xml_bytes = ast.literal_eval(f)
# save the converted string version of the XML file
with open('output.xml', 'w') as file_obj:
    file_obj.write(raw_xml_bytes.decode())
# read saved XML file (pass the open file object, not the string f)
with open('output.xml', 'r') as xml_file:
    tree = etree.parse(xml_file)
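Opening the XML file in binary mode (for example open("property.xml", "rb")) and reading it returns data of type bytes, which has a .decode() method (see https://docs.python.org/3/library/stdtypes.html#bytes.decode). With the appropriate encoding name you can do the following: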
my_xml_text = xml_file.read().decode('utf-8')
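A minimal sketch of the decode approach, under the assumption that the file holds the text of a bytes literal as shown in the question. The sample contents, the temporary file, and the element name a are hypothetical. It suggests that decoding alone yields a str that still carries the b'...' wrapper, so ast.literal_eval is still needed before parsing.

```python
import ast
import os
import tempfile
import xml.etree.ElementTree as etree

# Hypothetical stand-in for the contents of property.xml.
sample = (
    "b'<?xml version=\"1.0\" encoding=\"UTF-8\" ?>"
    "<root><a type=\"int\">1</a></root>'"
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "property.xml")
    with open(path, "w", encoding="utf-8") as out_file:
        out_file.write(sample)

    # Binary mode: read() returns bytes, which has a .decode() method.
    with open(path, "rb") as xml_file:
        my_xml_text = xml_file.read().decode("utf-8")

# The decoded text is a str, but it still starts with the literal b' prefix.
still_wrapped = my_xml_text.startswith("b'")

# Evaluate the literal to recover the real XML bytes, then parse them.
root = etree.fromstring(ast.literal_eval(my_xml_text))
value = int(root.find("a").text)
print(still_wrapped, value)
```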