
Python: How to handle UnicodeEncodeError?

This is what I am seeing:

Traceback (most recent call last):
  File "/home/user/tools/executeJobs.py", line 86, in <module>
    owner = re.sub('^(AS[0-9]+ )', '', str(element[2]))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 13: ordinal not in range(128)

The traceback already shows the offending line. A call like str(array[0]) has never failed for me before. How can I work around this? A quick and dirty solution is fine.

Update:

element[2] comes from this binary .dat list: http://github.com/maxmind/geoip-api-php/blob/master/tests/data/ … also available here: http://dev.maxmind.com/geoip/legacy/geolite (the IP/ASN one at the bottom of the table)

\xe7 appears to be ç (c with cedilla) in Latin-1 encoding.

So, assuming you have a unicode string, u"\xe7".encode("latin1") should give you the bytestring "\xe7". You could also choose to encode it as UTF-8: u"\xe7".encode("utf8") would give you the bytestring "\xc3\xa7". That may or may not fix your issue, but it will definitely give you a different error.
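As a small sketch of the encodings described above: the snippet encodes the same one-character string three ways and records whether the ASCII attempt fails. The variable names are only illustrative, and the byte literals are written so the code should behave the same on Python 2.6+ and Python 3.

```python
text = u"\xe7"  # c with cedilla, code point U+00E7

latin1_bytes = text.encode("latin1")  # one byte: \xe7
utf8_bytes = text.encode("utf8")      # two bytes: \xc3\xa7

# ASCII only covers code points 0-127, so this is the encode that fails.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```

In Python 2, str() on a unicode object performs that ASCII encode implicitly, which would explain the traceback in the question.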

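For a quick and dirty solution:

import re

try:
    # Pass element[2] to re.sub directly; without str() there is no implicit ASCII encode.
    owner = re.sub('^(AS[0-9]+ )', '', element[2])
except TypeError as e:
    # element[2] was not a string at all (for example None).
    print "Weird:", element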
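To illustrate why dropping str() is enough, here is a minimal sketch with a made-up row. The contents of `element` are hypothetical and only stand in for whatever the GeoIP lookup returns. re.sub accepts unicode text as-is, so the non-ASCII character passes through untouched.

```python
import re

# Hypothetical row; in the question, element[2] comes from the GeoIP ASN data.
element = [u"1.2.3.4", u"5.6.7.8", u"AS12345 Fran\xe7ois Networks"]

# No str() call, so nothing tries to encode the text as ASCII.
owner = re.sub(u"^(AS[0-9]+ )", u"", element[2])
```

The result is still a unicode string, so it may need an explicit encode (for example to UTF-8) before being written to a byte-oriented destination.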

I've always used

s.replace(u'\xa0',' ')

In your case, it should look something like

s.replace(u'\xe7','whatever')
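A rough sketch of the replace approach applied to this case, next to two blunter options that use the standard `errors` argument of encode(). The sample string and the choice of "c" as the replacement are assumptions for illustration, not values from the question.

```python
import re

# Hypothetical sample value standing in for element[2].
s = u"AS12345 Fran\xe7ois Networks"

# Targeted replacement: swap one known character, then strip the AS prefix.
cleaned = s.replace(u"\xe7", u"c")
owner = re.sub(u"^(AS[0-9]+ )", u"", cleaned)

# Blunter alternatives that cover any non-ASCII character at encode time.
dropped = s.encode("ascii", "ignore")    # offending characters are removed
replaced = s.encode("ascii", "replace")  # offending characters become "?"
```

The targeted replace only handles the characters you list, while "ignore" and "replace" lose information for every non-ASCII character, so which one is acceptable depends on how the owner name is used afterwards.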
