
Python utf8 encoding problem

I'm working on a Python application and having some problems handling strings.

There is this string "She's Out of My League" (without quotes). I stored it in a variable and tried to insert it into an sqlite3 database. But I get this error:

sqlite3.ProgrammingError: You must not use 8-bit bytestrings unless you use a text_factory that can interpret 8-bit bytestrings (like text_factory = str). It is highly recommended that you instead just switch your application to Unicode strings.

So, I tried to convert the string to Unicode. I tried both of these:

new_str = unicode(old_str)
new_str = old_str.encode("utf8")

But this gives me another error:

UnicodeDecodeError: 'utf8' codec can't decode byte 0x92 in position 49: unexpected code byte

I'm stuck here. What am I doing wrong?

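Simple. You're assuming that it's UTF-8, and it isn't. Byte 0x92 is not valid on its own in UTF-8, but in Windows-1252 (cp1252) it is the right single quotation mark (’), so decode the string with that codec instead: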

>>> print 'She\x92s Out of My League'.decode('cp1252')
She’s Out of My League
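
For anyone hitting this on Python 3, here is a rough sketch of the same idea, carried through to the sqlite3 insert. It assumes the data really is cp1252 (the question doesn't say where the string came from), and the `movies` table and variable names are made up for illustration.

```python
import sqlite3

# Raw bytes as they might arrive from a Windows-1252 source (assumed sample).
raw = b"She\x92s Out of My League"

# Decoding as UTF-8 fails: 0x92 cannot start a UTF-8 sequence.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Decoding as cp1252 maps 0x92 to U+2019 (right single quotation mark).
title = raw.decode("cp1252")

# Once it is a text string, sqlite3 accepts it as a normal parameter.
# The table name "movies" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movies (title TEXT)")
conn.execute("INSERT INTO movies (title) VALUES (?)", (title,))
stored = conn.execute("SELECT title FROM movies").fetchone()[0]
conn.close()
```

The point is to decode once, at the boundary where the bytes enter the program, and work with text everywhere after that. If the source encoding is not actually cp1252, the codec name needs to change accordingly.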
