
Unicode decode error: how to skip invalid characters

Original 2014-12-12 23:47:54 8 2 python

Is there a way to preprocess a text file and skip these characters?

UnicodeDecodeError: 'utf8' codec can't decode byte 0xa1 in position 1395: invalid start byte
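
For context, 0xa1 is a continuation byte in UTF-8 and can never start a sequence, which is why the decoder reports an invalid start byte. Below is a minimal Python 3 sketch of how one might locate the offending byte before deciding how to handle it. The sample bytes and the helper name `find_bad_byte` are made up for illustration and are not part of the original question.

```python
def find_bad_byte(data: bytes, encoding: str = "utf-8"):
    """Return (offset, byte value) of the first undecodable byte, or None."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError as exc:
        # exc.start is the index of the first byte the codec rejected
        return exc.start, data[exc.start]
    return None


# Hypothetical data: valid UTF-8 ("café ") followed by a stray 0xa1 byte
sample = b"caf\xc3\xa9 \xa1Hola!"
bad = find_bad_byte(sample)
if bad is not None:
    offset, value = bad
    print(f"cannot decode byte {value:#04x} at position {offset}")
```

Knowing the position makes it easier to judge whether the file is mostly UTF-8 with a few stray bytes or is really in a different encoding.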

2 Answers

Try this:

str.decode('utf-8', errors='ignore')
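
The call above is Python 2 syntax, where `str` holds bytes. As a rough Python 3 sketch of the same idea, the `errors` argument can be passed either to `bytes.decode` or to `open`. The sample bytes and the helper name `read_text_skipping_invalid` are assumptions for illustration only.

```python
# Hypothetical bytes: a stray 0xa1 plus a valid UTF-8 euro sign (0xe2 0x82 0xac)
raw = b"price: \xa1100 \xe2\x82\xac"

# errors="ignore" silently drops bytes that are not valid UTF-8
skipped = raw.decode("utf-8", errors="ignore")

# errors="replace" substitutes U+FFFD so the loss stays visible
marked = raw.decode("utf-8", errors="replace")


def read_text_skipping_invalid(path):
    """Read a file as UTF-8, dropping any bytes that cannot be decoded."""
    with open(path, encoding="utf-8", errors="ignore") as f:
        return f.read()
```

Note that `ignore` discards data without any warning, while `replace` leaves a marker wherever a byte was rejected, which can be easier to audit afterwards.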

I think your text file has some special characters that 'utf-8' can't decode.

You need to try using 'ISO-8859-1' instead of 'utf-8', like this:

   # Python 2 only: reload(sys) restores sys.setdefaultencoding, which site.py removes at startup.
   import sys
   reload(sys).setdefaultencoding("ISO-8859-1")

   # put your code here
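
Neither `sys.setdefaultencoding` nor the built-in `reload` exists in Python 3, so the snippet above applies to Python 2 only. A hedged Python 3 sketch of the same suggestion is to name the encoding explicitly when decoding. ISO-8859-1 maps every byte value to a character, so decoding never raises, but the result will contain wrong characters if the file is actually in another encoding. The sample bytes and the helper name `read_latin1` are illustrative assumptions.

```python
# Hypothetical bytes: 0xa1 is an inverted exclamation mark in ISO-8859-1
raw = b"\xa1Hola!"

# Never raises: all 256 byte values have a mapping in ISO-8859-1
text = raw.decode("iso-8859-1")


def read_latin1(path):
    """Read a file as ISO-8859-1 instead of changing a process-wide default."""
    with open(path, encoding="iso-8859-1") as f:
        return f.read()
```

Passing the encoding per call keeps the choice local to the file being read, rather than changing how every implicit conversion in the program behaves.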

Related questions:

How to decode escaped Unicode characters?

How to convert or decode the Unicode characters in pandas DataFrame?

Python how to decode unicode with hex characters

In Python how to encode/decode unicode characters such as ö

How to replace invalid unicode characters in a string in Python?

How do I decode unicode characters via python?

unicode decode error for weasyprint

Unicode Decode Error in Python

Regex match invalid Unicode characters

Decode and encode unicode characters as '\u####'


