
Python – How do I convert an ASCII string into UTF-8?

I am using a package in Python that returns a string using ASCII characters as opposed to Unicode (e.g. it returns 'serÃ©' as opposed to 'seré').

Given this is Python 3.8, the string is actually encoded in Unicode; the package just seems to output it as if it were ASCII. As such, when I try to perform x.decode('utf-8') or x.encode('ascii'), neither works. Is there a way to make Python treat the string as if it were ASCII, so that I can decode it to Unicode? Or is there a package that can serve this purpose?

I am relatively new to Python, so I apologise if my explanation is unclear. I am happy to clarify things if needed.

Code

from spanishconjugator import Conjugator as c  
verb = c().conjugate('pasar', 'preterite', 'indicative', 'yo')
print(verb)  

This returns the string 'pasÃ©' where it should return 'pasé'.

Update

From further searching and from your answers, it appears to be an issue with single 2-byte UTF-8 characters (é) being literally interpreted as two 1-byte latin-1 characters (Ã©). It has nothing to do with ASCII, my mistake.
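As a rough illustration of that mechanism (a sketch only; it does not use spanishconjugator and makes no claim about where inside the package the wrong decoding happens), the garbled string can be reproduced by encoding the correct text as UTF-8 and decoding the bytes as latin-1:

```python
# Reproduce the mojibake: UTF-8 bytes wrongly decoded as latin-1.
correct = "pas\u00e9"                      # 'pasé', 4 characters

utf8_bytes = correct.encode("utf-8")       # é becomes the two bytes C3 A9
garbled = utf8_bytes.decode("latin-1")     # each byte becomes one character

print(utf8_bytes)                          # b'pas\xc3\xa9'
print(len(correct), len(garbled))          # 4 5
print(garbled == "pas\u00c3\u00a9")        # True: this is 'pasÃ©'
```

Because latin-1 maps every byte value 0-255 to the code point with the same number, the wrong decoding never raises an error, which is also why it can be reversed without loss.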

Managed to fix it with:

verb.encode('latin-1').decode('utf-8')

Thank you to those who commented.

If the input string contains the raw byte ordinals (such as \xc3\xa9 / Ã© instead of é), use latin1 to encode it to bytes verbatim, then decode with the desired encoding (decode() defaults to UTF-8).

>>> "pasÃ©".encode('latin1').decode()
'pasé'
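If the same repair has to run over strings that may or may not be garbled, it could be wrapped in a small helper. This is only a sketch: `fix_mojibake` and its parameter names are hypothetical, not part of any library, and it assumes the damage is always UTF-8 text that was decoded as latin-1.

```python
def fix_mojibake(text: str, wrong: str = "latin-1", right: str = "utf-8") -> str:
    """Undo text that was decoded as `wrong` when the bytes were really `right`.

    Hypothetical helper for illustration. Returns the input unchanged
    when the round trip is not possible.
    """
    try:
        return text.encode(wrong).decode(right)
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside `wrong`, or the resulting
        # bytes are not valid `right`: assume it was not garbled this way.
        return text


garbled = "pas\u00c3\u00a9"                       # 'pasÃ©'
print(fix_mojibake(garbled) == "pas\u00e9")       # True: repaired to 'pasé'
print(fix_mojibake("pas\u00e9") == "pas\u00e9")   # True: already correct, unchanged
```

The fallback matters because an already-correct 'pasé' encodes to a lone \xe9 byte, which is not valid UTF-8 and would otherwise raise UnicodeDecodeError. One caveat: text that legitimately contains a sequence like 'Ã©' would also be rewritten, so this is best applied only to output known to come from the affected source.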
