Bangla encoding in Python - UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

How do I store the content "আপনার" as UTF-8 "আপনার"? I have tried the following:

>>> content = "আপনার"
>>> content
'\xe0\xa6\x86\xe0\xa6\xaa\xe0\xa6\xa8\xe0\xa6\xbe\xe0\xa6\xb0'

>>> content = "আপনার".encode("UTF-8")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe0 in position 0: ordinal not in range(128)

>>> content = "আপনার".decode("UTF-8")
>>> content
u'\u0986\u09aa\u09a8\u09be\u09b0'

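The decode call (your last attempt) works: the str here already holds UTF-8 bytes, so decode("UTF-8") turns them into a unicode object. Calling encode on a str fails because Python 2 first decodes it implicitly as ASCII, and 0xe0 is not an ASCII byte. To see the characters instead of the escaped form, use print content instead of evaluating content :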

>>> content = "আপনার".decode("UTF-8")
>>> print content
আপনার
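
As a rough sketch of the round trip described above (the names raw, text, round_trip and ascii_error are illustrative and not from the question), the bytes below are the ones shown in the first REPL output. The snippet is written so that it should run on Python 2.7 as well as Python 3, and the explicit ASCII decode stands in for the implicit one Python 2 performs inside str.encode:

```python
# The UTF-8 bytes from the question's first REPL output.
raw = b"\xe0\xa6\x86\xe0\xa6\xaa\xe0\xa6\xa8\xe0\xa6\xbe\xe0\xa6\xb0"

# Decoding turns the bytes into text (a sequence of code points).
text = raw.decode("utf-8")

# Encoding the text gives back the original bytes.
round_trip = text.encode("utf-8")

# Assumption for illustration: this explicit ASCII decode mimics the
# implicit step Python 2 runs before str.encode(), so it fails the same way.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

print(round_trip == raw)
print(ascii_error)
```

Decode once when the bytes enter the program and encode once when they leave; that avoids the implicit ASCII step altogether.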

__str__ and __repr__

This is the difference between the __str__ and __repr__ formats of an object. The first is meant to be human-readable; the second is meant to expose internals and be unambiguous. The interactive prompt echoes a value using its __repr__, while print uses the string form. You can read more in Difference between __str__ and __repr__ in Python.

String representation

>>> print unicode(content)
আপনার

__repr__ representation

>>> print content.__repr__()
u'\u0986\u09aa\u09a8\u09be\u09b0'
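
To illustrate that the \u escapes in the __repr__ form and the printed characters are the same value, here is a small sketch. The variable names are illustrative, and the snippet deliberately avoids printing Bengali text so that it does not depend on the terminal's encoding; it is written to run on Python 2.7 or Python 3:

```python
# The unicode value from the answer, written with the escapes that repr shows.
text = u"\u0986\u09aa\u09a8\u09be\u09b0"

# Each \uXXXX escape names one code point, so the string has five characters.
code_points = [hex(ord(ch)) for ch in text]
print(len(text))
print(code_points)

# Decoding the UTF-8 bytes from the question yields the very same value.
raw = b"\xe0\xa6\x86\xe0\xa6\xaa\xe0\xa6\xa8\xe0\xa6\xbe\xe0\xa6\xb0"
same_value = raw.decode("utf-8") == text
print(same_value)
```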
