Python utf8 encoding problem
I'm working on a Python application and having some problems handling strings.
There is this string "She's Out of My League" (without quotes). I stored it in a variable and tried to insert it into an sqlite3 database. But I get this error:
sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.
So, I tried to convert the string to unicode. I tried both of these:
new_str = unicode(old_str)
new_str = old_str.encode("utf8")
But this gives me another error:
UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 49: unexpected code byte
I'm stuck here. What am I doing wrong?
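Simple. You're assuming that it's UTF-8. It isn't: a lone 0x92 byte is not valid UTF-8, but in Windows-1252 (cp1252) it is the right single quotation mark (’), so decode with that codec instead.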
>>> print 'She\x92s Out of My League'.decode('cp1252')
She’s Out of My League
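To tie this back to the original sqlite3 error, here is a minimal sketch of the whole round trip, assuming the bytes really are Windows-1252 as the 0x92 suggests. The `movies` table and the variable names are made up for illustration; the in-memory database is only there to keep the example self-contained.

```python
import sqlite3

# Raw bytes as they might arrive from a Windows-1252 source.
# 0x92 is the curly apostrophe in cp1252.
raw = b"She\x92s Out of My League"

# Decoding as UTF-8 fails, because 0x92 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decode with the encoding the bytes are actually in.
title = raw.decode("cp1252")

# Hypothetical table, used only to show that the decoded text inserts cleanly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.execute("INSERT INTO movies (title) VALUES (?)", (title,))
stored = conn.execute("SELECT title FROM movies").fetchone()[0]
conn.close()
```

Once the value is a Unicode string rather than a byte string, sqlite3 accepts it without needing a custom text_factory. If the source of the bytes is unknown, cp1252 is only a guess and should be checked against where the data came from.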