
Unicode error when extracting data from an XML file in Python

import os, csv, io

from xml.etree import ElementTree
file_name = "example.xml"
full_file = os.path.abspath(os.path.join("xml", file_name))
dom = ElementTree.parse(full_file)
Fruit = dom.findall("Fruit")

with io.open('test.csv','w', encoding='utf8') as fp:
    a = csv.writer(fp, delimiter=',')
    for f in Fruit:
        Explanation = f.findtext("Explanation")
        Types = f.findall("Type")
        for t in Types:
            Type = t.text
            a.writerow([Type, Explanation])    

I am extracting data from an XML file and writing it to a CSV file, and I get the error message below. It is probably because the extracted data contains a degree sign (as in °F). How can I get rid of these Unicode errors without manually editing the XML file?

For the last line of my code I get this error message: UnicodeEncodeError: 'ascii' codec can't encode character u'\xb0' in position 1267: ordinal not in range(128)

<Fruits>
<Fruit>
    <Family>Citrus</Family>
    <Explanation>They cannot grow at a temperature below 32 °F</Explanation>
    <Type>Orange</Type>
    <Type>Lemon</Type>
    <Type>Lime</Type>
    <Type>Grapefruit</Type>
</Fruit>
</Fruits>

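You didn't say where the error occurs, but it is probably the last line. In Python 2 the csv module does not handle unicode objects, so it implicitly encodes them as ASCII, which fails on the degree sign. You have to encode the strings yourself: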

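# Python 2: the csv module works on byte strings, so open the file in
# binary mode and encode each unicode value as UTF-8 before writing it.
with open('test.csv', 'wb') as fp:
    a = csv.writer(fp, delimiter=',')
    for f in Fruit:
        explanation = f.findtext("Explanation")
        types = f.findall("Type")
        for t in types:
            a.writerow([t.text.encode('utf8'), explanation.encode('utf8')])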
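For comparison, here is a minimal sketch of the same extraction on Python 3, where the csv module accepts text strings directly and no manual encoding is needed. The helper name write_fruit_csv and the inline sample string are assumptions made for illustration, not part of the original code. The sample is a shortened copy of the XML above and is written to an in-memory buffer so the snippet can run on its own.

```python
import csv
import io
from xml.etree import ElementTree


def write_fruit_csv(root, fp):
    """Write one (type, explanation) row per <Type> element to fp.

    write_fruit_csv is a hypothetical helper name; fp must be a text-mode
    file object (Python 3).
    """
    writer = csv.writer(fp, delimiter=",")
    for fruit in root.findall("Fruit"):
        # default="" avoids passing None if <Explanation> is missing
        explanation = fruit.findtext("Explanation", default="")
        for t in fruit.findall("Type"):
            writer.writerow([t.text, explanation])


# Shortened sample based on the XML in the question (\u00b0 is the degree sign).
sample = (
    "<Fruits><Fruit>"
    "<Explanation>They cannot grow at a temperature below 32 \u00b0F</Explanation>"
    "<Type>Orange</Type><Type>Lemon</Type>"
    "</Fruit></Fruits>"
)
root = ElementTree.fromstring(sample)

buffer = io.StringIO()
write_fruit_csv(root, buffer)
print(buffer.getvalue())
```

To write a real file instead of the buffer, the file would presumably be opened as open('test.csv', 'w', encoding='utf-8', newline='') and passed as fp. The csv documentation recommends newline='' so that the writer controls line endings itself.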
