
Python UTF-8 encoding problem

I'm working on a Python application and having some problems handling strings.

I have the string "She's Out of My League" (without the quotes). I stored it in a variable and tried to insert it into an sqlite3 database, but I get this error:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

So I tried to convert the string to Unicode. I tried both of these:

new_str = unicode(old_str)
new_str = old_str.encode("utf8")

But this gives me another error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 49: unexpected code byte

I'm stuck here. What am I doing wrong?

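Simple. You're assuming that it's UTF-8, and it isn't. The byte 0x92 is a continuation byte in UTF-8, so it can't appear on its own, which is what the error is complaining about. In Windows-1252 (cp1252), 0x92 is the right single quotation mark (’), so decode with that codec instead: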

>>> print 'She\x92s Out of My League'.decode('cp1252')
She’s Out of My League
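
To tie this back to the sqlite3 error, here is a minimal sketch of decoding the bytes first and then inserting the resulting Unicode string. It assumes the data is either UTF-8 or cp1252; the helper name to_text, the table name movies, and the in-memory database are made up for illustration and are not from the question.

```python
# -*- coding: utf-8 -*-
import sqlite3

# The raw bytes from the question; 0x92 is the cp1252 right single quote.
raw = b"She\x92s Out of My League"


def to_text(data, encodings=("utf-8", "cp1252")):
    """Hypothetical helper: return the first successful decode of `data`.

    The candidate list is an assumption; put the stricter codec (UTF-8)
    first, because cp1252 accepts almost any byte sequence.
    """
    for enc in encodings:
        try:
            return data.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings matched")


title = to_text(raw)  # UTF-8 fails on 0x92, so cp1252 is used

# Insert the decoded (Unicode) string; sqlite3 accepts it without complaint.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.execute("INSERT INTO movies (title) VALUES (?)", (title,))
stored = conn.execute("SELECT title FROM movies").fetchone()[0]
conn.close()
```

If you know for certain where the data comes from (for example, a file saved by a Windows program), it is simpler and safer to call .decode('cp1252') directly than to guess.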
