Python: unicode .encode, can a function be called on unencodable characters?

I have a Unicode text that I'd like to encode in latin-1. Some characters cannot be encoded. If I use encode with the "replace" error handler, I get a question mark in place of each of them, but is there a way to call a custom function to replace the character instead?

For example, I'd like to convert every character that latin-1 can represent, and call unidecode.unidecode() on the characters that cannot be encoded. Is that possible?

You can create your own error handler with codecs.register_error('myerrorhandler', function). The function receives the UnicodeEncodeError and returns a tuple of the replacement string and the position at which encoding should resume.

>>> import codecs
>>> codecs.register_error('silly', lambda e: ('X', e.start+1))
>>> 'foöbar'.encode('ascii', 'silly')
b'foXbar'
>>>
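
To get closer to what the question asks for, the same mechanism can be pointed at latin-1 with a handler that transliterates instead of inserting a fixed character. The following is only a sketch: the handler name 'transliterate' and the helper ascii_fallback are made up for illustration, and ascii_fallback is a standard-library stand-in (accent stripping via unicodedata) rather than unidecode itself. If unidecode is installed, calling unidecode.unidecode(bad) in its place should give richer results.

```python
import codecs
import unicodedata


def ascii_fallback(text):
    """Hypothetical stand-in for unidecode.unidecode().

    Decomposes characters (NFKD), drops anything that is not ASCII,
    and falls back to '?' when nothing is left.
    """
    decomposed = unicodedata.normalize('NFKD', text)
    stripped = decomposed.encode('ascii', 'ignore').decode('ascii')
    return stripped or '?'


def transliterate_errors(exc):
    # This handler is only meant for encoding; re-raise anything else.
    if not isinstance(exc, UnicodeEncodeError):
        raise exc
    # exc.object is the full input string; start/end mark the failing run.
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the index where encoding resumes.
    return ascii_fallback(bad), exc.end


# 'transliterate' is an arbitrary name chosen for this sketch.
codecs.register_error('transliterate', transliterate_errors)

encoded = 'Dvořák €5'.encode('latin-1', 'transliterate')
print(encoded)  # b'Dvor\xe1k ?5'
```

Here á is kept as a real latin-1 byte, ř is reduced to r, and € becomes ? because it has no decomposition in this simple fallback. Note that the replacement string returned by the handler must itself be encodable in the target encoding, which plain ASCII always is for latin-1.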
