How to fix a Unicode error when generating an XML file with Python's xml.etree.ElementTree without losing any data?

I am generating an XML document using xml.etree.ElementTree in Python and then writing the generated XML to a file. One of the tags describes the software installed on the system and holds all of its details. The XML output I get on the console when I run the script is perfect, but when I try to write the output to a file I get the following error:

Traceback (most recent call last):
  File "C:\xmltry.py", line 65, in <module>
    f.write(prettify(top))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf1' in position 4305: ordinal not in range(128)

The script is as follows:

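from xml.dom import minidom
from xml.etree import ElementTree
from xml.etree.ElementTree import Element, SubElement, Comment

def prettify(elem):
    """Return a pretty-printed XML string for the Element.
    """
    rough_string = ElementTree.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")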

##Here starts populating elements inside xml file
top = Element('my_practice_document')
comment = Comment('details')
top.append(comment)
child = SubElement(top, 'my_information')
childs = SubElement(child,'my_name')
childs.text = str(options.my_name)

#Following section is for retrieving list of software installed on the system
import wmi
w = wmi.WMI()
for p in w.Win32_Product():
    if (p.Version is not None) and (p.Caption is not None):
        child = SubElement(top, 'sys_info')
        child.text = p.Caption + " version " + p.Version

## Following portion places the xml output into test.xml file
with open("test.xml", 'w') as f:
    f.write(prettify(top))

When the script is executed I get the Unicode error. I searched the internet and also tried the following:

import sys
reload(sys)
sys.setdefaultencoding('utf8')

But this did not resolve my issue either. I want all the data I see on the console to end up in the file, without losing anything. How can I achieve that? Thanks in advance for your assistance.

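You need to specify an encoding for your output file; sys.setdefaultencoding doesn't do that for you. prettify() returns a unicode string, because toprettyxml is called without an encoding, and a file opened with the plain built-in open() in Python 2 has to convert that string to bytes implicitly when you write it. That implicit conversion is where the 'ascii' codec in your traceback comes from.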

Try:

import codecs
with codecs.open("test.xml", 'w', encoding='utf-8') as f:
    f.write(prettify(top))
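
For completeness, here is a minimal, self-contained sketch of the same idea that can be run without wmi. It is written to run on Python 3 and uses io.open, which also exists on Python 2.6+ and takes an explicit encoding in the same way as codecs.open. The caption text and the temporary file path are made up for illustration; they stand in for the real p.Caption values and your test.xml. The second half shows an alternative that I am assuming fits your case equally well: ask minidom for UTF-8 bytes directly and write them in binary mode.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile
from xml.dom import minidom
from xml.etree import ElementTree
from xml.etree.ElementTree import Element, SubElement


def prettify(elem):
    """Return a pretty-printed XML text string for the Element."""
    rough_string = ElementTree.tostring(elem, 'utf-8')
    reparsed = minidom.parseString(rough_string)
    return reparsed.toprettyxml(indent="  ")


top = Element('my_practice_document')
child = SubElement(top, 'sys_info')
# Hypothetical caption containing the character from the traceback (u'\xf1').
child.text = u"Espa\xf1ol version 1.0"

# Hypothetical output location, used here instead of "test.xml".
out_dir = tempfile.mkdtemp()
text_path = os.path.join(out_dir, "test.xml")
bytes_path = os.path.join(out_dir, "test_bytes.xml")

# Option 1: keep prettify() as it is and let the file object do the encoding.
with io.open(text_path, 'w', encoding='utf-8') as f:
    f.write(prettify(top))

# Read the file back to check that nothing was lost.
with io.open(text_path, 'r', encoding='utf-8') as f:
    round_tripped = f.read()

# Option 2: have minidom produce UTF-8 bytes and write them in binary mode.
rough = ElementTree.tostring(top, 'utf-8')
pretty_bytes = minidom.parseString(rough).toprettyxml(indent="  ", encoding="utf-8")
with open(bytes_path, 'wb') as f:
    f.write(pretty_bytes)
```

With the second option the XML declaration written by minidom also names the encoding, which may be useful if other tools will read the file.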
