
Python UnicodeDecodeError: 'utf-8' can't decode byte 0x81

I am trying to use Arelle to read a zip file of an XBRL filing.

This is done by giving the command:

C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10-002778-xbrl.zip

I am getting a UnicodeDecodeError:

C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10-002778-xbrl.zip
[xmlSchema:syntax] Unrecoverable error: 'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte, 0000002809-0001047469-10-002778-xbrl.zip, importing source element - 0000002809-0001047469-10-002778-xbrl.zip
Traceback (most recent call last):
  File "C:\a\arelle\ModelDocument.py", line 131, in load
    xmlDocument = etree.parse(file,parser=_parser,base_url=filepath)
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src\lxml\lxml.etree.c:69970)
  File "parser.pxi", line 1770, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:102272)
  File "parser.pxi", line 1790, in lxml.etree._parseFilelikeDocument (src\lxml\lxml.etree.c:102531)
  File "parser.pxi", line 1685, in lxml.etree._parseDocFromFilelike (src\lxml\lxml.etree.c:101457)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike (src\lxml\lxml.etree.c:97084)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:91290)
  File "parser.pxi", line 679, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:92441)
  File "lxml.etree.pyx", line 327, in lxml.etree._ExceptionContext._raise_if_stored (src\lxml\lxml.etree.c:10196)
  File "parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (src\lxml\lxml.etree.c:89098)
  File "C:\Python33\lib\codecs.py", line 301, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 11: invalid start byte

It has something to do with UTF-8 encoding and the character the byte represents, but I cannot figure out what I should do. I found some guides, but they didn't help me address the issue.

The issue arose because the program needs to be pointed not at the whole zip archive but at a specific file (in this case the instance file) that lies inside the archive. Given the archive path alone, the raw bytes of the zip container are handed to the XML parser and decoded as UTF-8, which fails.
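To see why the raw bytes of a zip archive trip the UTF-8 decoder, here is a minimal sketch using only the standard library. It does not use Arelle or the original filing; the archive is built in memory, and the member name `1.xml` and the timestamp are assumptions chosen for illustration. In a zip local file header, bytes 10 and 11 hold the MS-DOS modification time, so a member stamped at 16:08 puts 0x81 at position 11, which happens to reproduce the same message as in the question. What the original archive actually contained at that position is not known.

```python
import io
import zipfile


def build_sample_zip() -> bytes:
    """Build a tiny in-memory zip archive (illustrative data only)."""
    # The timestamp is an assumption picked so that the packed DOS time
    # (hour << 11 | minute << 5 | second // 2) equals 0x8100.
    info = zipfile.ZipInfo("1.xml", date_time=(2010, 3, 1, 16, 8, 0))
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, "<root/>")
    return buf.getvalue()


def try_decode(raw: bytes):
    """Return the UnicodeDecodeError raised by decoding raw as UTF-8, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


raw = build_sample_zip()
err = try_decode(raw)

print(raw[:4])   # zip local file header signature
print(err)       # the decode error, if any
```

The point is only that a zip container is binary data, not text, so any tool that reads it as UTF-8 XML is likely to stop at the first byte that is not valid UTF-8.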

To access a file inside the zip archive, append its path within the archive to the archive path. For example, if the file inside the archive is 1.xml:
C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10-002778-xbrl.zip\1.xml
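If you are not sure which file inside the archive to point at, you can list the members first. The following is a rough sketch with the standard-library `zipfile` and `xml.etree.ElementTree` modules, not a description of how Arelle itself opens archives. The helper names `list_xml_members` and `parse_member`, and the sample archive with its `1.xml` member, are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def list_xml_members(archive):
    """Return the names of .xml members in a zip (path or file-like object)."""
    with zipfile.ZipFile(archive) as zf:
        return [name for name in zf.namelist() if name.lower().endswith(".xml")]


def parse_member(archive, member):
    """Parse one member of the zip as XML and return its root element."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as fh:
            return ET.parse(fh).getroot()


# Illustrative archive built in memory; a real filing would be a path on disk.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as zf:
    zf.writestr("1.xml", "<xbrl><item>42</item></xbrl>")
    zf.writestr("readme.txt", "not xml")
sample_bytes = sample.getvalue()

members = list_xml_members(io.BytesIO(sample_bytes))
root = parse_member(io.BytesIO(sample_bytes), members[0])
print(members, root.tag)
```

Here the parser only ever sees the decompressed XML member, never the zip container's own bytes.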

Verdict:

The UnicodeDecodeError: 'utf-8' can't decode byte 0x81 was caused by the reason above: the command pointed at the zip archive itself rather than at a file inside it.

After a quick search, the offending byte does not appear to be related to the Ctrl key. The code point U+0081 is a C1 control character with no printable glyph, but that is not what triggers the error. In UTF-8, any byte in the range 0x80 to 0xBF is a continuation byte and can never begin a character, so when the decoder meets 0x81 where it expects the start of a character, it reports "invalid start byte". The failure happens while decoding the input, not while printing it.
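The role of 0x81 in UTF-8 can be checked directly. This is a small sketch; `utf8_byte_role` is a hypothetical helper written for illustration, and it classifies a single byte by the UTF-8 byte ranges rather than validating a whole sequence.

```python
import unicodedata


def utf8_byte_role(b: int) -> str:
    """Classify one byte value by the role it can play in UTF-8."""
    if b < 0x80:
        return "ascii"          # 0xxxxxxx: a complete one-byte character
    if b < 0xC0:
        return "continuation"   # 10xxxxxx: may only follow a lead byte
    if 0xC2 <= b <= 0xF4:
        return "lead"           # starts a 2-, 3- or 4-byte sequence
    return "invalid"            # 0xC0, 0xC1, 0xF5..0xFF never occur in UTF-8


role = utf8_byte_role(0x81)

# On its own, 0x81 cannot start a character, so decoding fails.
try:
    b"\x81".decode("utf-8")
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason

# After a suitable lead byte, the same byte is valid: C2 81 encodes U+0081.
char = b"\xc2\x81".decode("utf-8")
category = unicodedata.category(char)

print(role, reason, hex(ord(char)), category)
```

So the byte value alone does not identify a character; whether 0x81 is acceptable depends on the bytes around it.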
