UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 17710: ordinal not in range(128)

I'm trying to print a string from an archived web crawl, but when I do I get this error:

print page['html']
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 17710: ordinal not in range(128)

When I try print unicode(page['html'], errors='ignore') I get:

print unicode(page['html'],errors='ignore')
TypeError: decoding Unicode is not supported

Any idea how I can properly encode this string, or at least get it to print? Thanks.

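You need to encode the unicode you saved to display it, not decode it -- unicode is the unencoded form. page['html'] is already a unicode object, so passing it to unicode() with errors= asks Python to decode it a second time, which is what raises the TypeError. You should always specify an encoding, so that your code will be portable. The "usual" pick is utf-8: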

print page['html'].encode('utf-8')
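To make the encode/decode direction concrete, here is a minimal sketch. The string html is a hypothetical stand-in for page['html'] (the real crawl data is not shown in the question); it just contains the same u'\xe7' character named in the error.

```python
# -*- coding: utf-8 -*-
# Hypothetical stand-in for page['html']; the actual crawl data is not shown.
html = u'gar\xe7on'

# Encoding turns text (unicode) into bytes, which is what a terminal or file needs.
utf8_bytes = html.encode('utf-8')

# Encoding to ASCII fails on u'\xe7', the same failure as in the question.
try:
    html.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Decoding goes the other way: bytes back to text.
round_trip = utf8_bytes.decode('utf-8')
```

The u'...' literal and the encode/decode calls behave the same way on Python 2 and on Python 3.3+, so the sketch does not depend on the print statement used above.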

If you don't specify an encoding, whether or not it works will depend on what you're printing to -- your editor, OS, terminal program, etc.
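One way to take the output device out of the picture is to wrap the byte stream in a writer with a fixed encoding. This is only a sketch: html is again a hypothetical stand-in for page['html'], and an in-memory io.BytesIO buffer plays the role of the terminal so the result can be inspected.

```python
import codecs
import io

# Hypothetical stand-in for page['html'].
html = u'gar\xe7on'

# A byte buffer stands in for the real output stream in this sketch.
raw = io.BytesIO()

# codecs.getwriter returns a StreamWriter class that encodes text on write.
writer = codecs.getwriter('utf-8')(raw)
writer.write(html + u'\n')

# The bytes that reached the stream are UTF-8 regardless of locale settings.
written = raw.getvalue()
```

In a real script the wrapped stream would presumably be sys.stdout on Python 2 or sys.stdout.buffer on Python 3, instead of the in-memory buffer used here.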

